In this notebook, I will develop a deep learning model to classify images from the CIFAR-10 dataset.
I start by installing a few extra libraries which I will use later in the notebook. The key library here is TensorFlow Addons, which extends TensorFlow and Keras with implementations of more recent deep learning techniques.
!pip install -q -U keras-tuner tensorflow_addons
!nvidia-smi
Sun Nov 28 09:46:18 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.44 Driver Version: 460.32.03 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... Off | 00000000:00:04.0 Off | 0 |
| N/A 37C P0 39W / 300W | 15503MiB / 16160MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
%load_ext tensorboard
The tensorboard extension is already loaded. To reload it, use: %reload_ext tensorboard
I begin by importing the essential libraries:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import plotly.graph_objects as go
import tensorflow as tf
import seaborn as sns
sns.set(style="dark")
from tensorflow.keras.datasets.cifar10 import load_data
from tensorflow.keras import Model, Input, Sequential
from tensorflow.keras.layers import Rescaling, Dense, Conv2D, GlobalAveragePooling2D, MaxPool2D, Dropout, BatchNormalization, ReLU, Layer, Reshape, Flatten, Activation, Normalization, Multiply, AveragePooling2D
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, TensorBoard, TerminateOnNaN, CSVLogger
from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.regularizers import l2
from tensorflow_addons.activations import mish
from tensorflow_addons.optimizers import SWA
from tensorflow_addons.callbacks import AverageModelCheckpoint
# from tensorflow.keras.applications import efficientnet_v2
import keras_tuner as kt
from functools import partial
from tensorflow.image import random_flip_left_right, random_crop, resize_with_crop_or_pad
from tensorflow.keras.models import load_model
To allow for improved reproducibility, I set a random seed. 42 is, of course, the answer to life, the universe, and everything.
np.random.seed(42)
tf.random.set_seed(42)
Our problem statement is to develop a deep learning model that is capable of classifying images from the CIFAR-10 dataset.
Our goal is to develop a model that generalizes well to new data points (that is, one that does not overfit).
The main reason for setting this goal is that while high accuracy on the training set is nice, a model we actually want to use in the real world must be able to generalize well to new data.
We will begin by loading our data. Since a function to download the dataset is already included in Keras, we will make use of it to quickly load our data.
When training and evaluating our model, we will split our data into a training, validation, and testing set.
The training set will be used to train the model, the validation set for model tuning, and the testing set will be used to evaluate the final model, ensuring that it is able to generalize. (and does not overfit to the validation set as a result of our model tuning)
| Split | Size |
|---|---|
| Training | 40K |
| Validation | 10K |
| Testing | 10K |
In practice, this just means that when training our model, we will use 20% of the original training data for validation. We choose a large validation set because our model tuning decisions will be based on it, and a larger validation set makes it harder to overfit to.
(X_train, y_train), (X_test, y_test) = load_data()
train_size = 40000
X_train, y_train, X_val, y_val = X_train[:train_size], y_train[:train_size], X_train[train_size:], y_train[train_size:]
print("Length of Training Set:", len(X_train))
print("Length of Validation Set:", len(X_val))
print("Length of Testing Set:", len(X_test))
Length of Training Set: 40000 Length of Validation Set: 10000 Length of Testing Set: 10000
Each numbered label in the dataset represents a specific object class. To make the labels more readable, we will use a dictionary to map each number to the corresponding description.
class_labels = {
0 : "airplane",
1 : "automobile",
2 : "bird",
3 : "cat",
4 : "deer",
5 : "dog",
6 : "frog",
7 : "horse",
8 : "ship",
9 : "truck"
}
IMG_SIZE = (32, 32, 3)
Before we even start modelling, it's important to get a grip on the data. Let's start by checking the shape of a single image.
X_train[0].shape
(32, 32, 3)
Each image is an RGB image of size 32x32. It is worth noting that these images are quite small, which means the neural network does not need a particularly large receptive field.
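To make the receptive field point concrete, the receptive field of a stack of convolution/pooling layers can be computed with the standard recurrence. This is a small standalone helper for illustration; the layer lists below are hypothetical examples, not the architecture built later in this notebook.

```python
def receptive_field(layers):
    """Compute the receptive field of a stack of layers.

    layers: list of (kernel_size, stride) tuples, applied in order.
    """
    r, j = 1, 1  # receptive field size and cumulative stride ("jump")
    for k, s in layers:
        r += (k - 1) * j  # each layer widens the field by (k-1) * jump
        j *= s            # striding multiplies the jump for later layers
    return r

# Two stacked 3x3 convs see a 5x5 patch of the input
print(receptive_field([(3, 1), (3, 1)]))  # 5

# Adding a stride-2 pool and two more 3x3 convs grows it to 14x14,
# already a sizeable fraction of a 32x32 image
print(receptive_field([(3, 1), (3, 1), (2, 2), (3, 1), (3, 1)]))  # 14
```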
Let's take a look at a subset of random images first.
random_idxs = np.random.choice(X_train.shape[0], 10, replace=False)
fig, ax = plt.subplots(2, 5, figsize=(10, 5), tight_layout=True)
for idx, subplot in zip(random_idxs, ax.ravel()):
subplot.axis("off")
subplot.imshow(X_train[idx])
subplot.set_title(f"Label: {class_labels[y_train[idx, 0]]}")
Let's take a look at what each class looks like.
fig, ax = plt.subplots(1, 10, figsize=(20, 15))
for i in range(10):
images = X_train[np.squeeze(y_train == i)][0]
label = class_labels[i]
subplot = ax[i]
subplot.axis("off")
subplot.imshow(images)
subplot.set_title(f"Label: {label}")
fig.show()
fig, ax = plt.subplots(10, 10, figsize=(20, 20))
for i in range(10):
images = X_train[np.squeeze(y_train == i)][:10]
label = class_labels[i]
for j in range(10):
subplot = ax[i, j]
subplot.axis("off")
subplot.imshow(images[j])
subplot.set_title(f"Label: {label}")
fig.show()
We can see that the images are fairly diverse, with different viewing angles. We also note that some of the classes are quite broad. For example, we can see that along with real planes, toy planes are also included in the airplane class.
labels, counts = np.unique(y_train, return_counts=True)
for label, count in zip(labels, counts):
print(f"{class_labels[label]}: {count}")
airplane: 3986 automobile: 3986 bird: 4048 cat: 3984 deer: 4003 dog: 3975 frog: 4020 horse: 4023 ship: 3997 truck: 3978
plt.barh(labels, counts, tick_label=list(class_labels.values()))
<BarContainer object of 10 artists>
We can see that there is an even class balance. This means that we can make use of accuracy as our primary metric, as there is no real "minority" class, so accuracy is a good measure of classification performance.
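To make this concrete: with ten balanced classes, even a degenerate classifier that always predicts a single class only reaches about 10% accuracy, so accuracy cannot be inflated by a dominant class. A quick sketch with synthetic labels (mirroring the roughly 4,000-per-class training split, not the actual CIFAR-10 arrays):

```python
import numpy as np

# 10 balanced classes, 4000 labels each (roughly mirroring our training split)
y = np.repeat(np.arange(10), 4000)

# A degenerate classifier that always predicts class 0, regardless of input
y_pred = np.zeros_like(y)

accuracy = (y == y_pred).mean()
print(accuracy)  # 0.1
```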
mean, std = np.mean(X_train, axis=(0, 1, 2)), np.std(X_train, axis=(0, 1, 2))
print("Mean:", mean)
print("std:", std)
Mean: [125.32067661 122.92584397 113.78740571] std: [63.02791301 62.14517894 66.73149283]
These are the average and standard deviation of the pixel intensities on each color channel (Red, Green, Blue).
plt.imshow(np.mean(X_train, axis=0) / 255)
<matplotlib.image.AxesImage at 0x7fef89b64350>
fig, ax = plt.subplots(1, 10, figsize=(32, 10))
for idx, subplot in enumerate(ax):
avg_image = np.mean(X_train[np.squeeze(y_train == idx)], axis=0) / 255
subplot.imshow(avg_image)
subplot.set_title(f"{class_labels[idx]}")
subplot.axis("off")
Although the average images are fairly blurry, we can still roughly make out the automobile, horse and truck. The average images for the other classes are harder to make out, which could suggest that those classes will be more difficult to predict.
Before we model the data, it is important to do some basic pre-processing on it.
As they are, the current labels are label encoded (integers 0-9). We will one-hot encode the labels using the to_categorical function from the Keras utilities.
y_train = to_categorical(y_train)
y_val = to_categorical(y_val)
y_test = to_categorical(y_test)
print(y_train[0])
print("Label:", tf.argmax(y_train[0]))
[0. 0. 0. 0. 0. 0. 1. 0. 0. 0.] Label: tf.Tensor(6, shape=(), dtype=int64)
From the one hot encoded label, we can use argmax to get back the original label in label encoded form.
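That round trip is easy to sketch in plain NumPy (an illustration of what to_categorical and argmax do, not the Keras implementation itself):

```python
import numpy as np

labels = np.array([6, 0, 9, 3])

# One-hot encode: row i of the 10x10 identity matrix encodes class i
one_hot = np.eye(10)[labels]

# argmax over the class axis recovers the original integer labels
recovered = one_hot.argmax(axis=1)
print(recovered)  # [6 0 9 3]
```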
From prior experimentation, I have found that normalizing the inputs improves the accuracy of the resulting model, as it converges faster. This is because the optimization algorithm we will be using, SGD, converges better when the features are on approximately the same scale.
Since we are standardizing the data, the result will be centered around 0 with a standard deviation of 1, so we do not need to rescale the images to [0, 1] beforehand.
Normalizing the inputs means that we will calculate the mean and standard deviation of the training set, and then apply the formula below
$$ X_{channel} = \frac{X_{channel} - \mu_{channel}}{\sigma_{channel}} $$

Note that we prevent data leakage by ensuring we don't use any of the validation/testing data to calculate the per-channel mean and std. This is why we did our train-val-test split beforehand.
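As a standalone illustration of this formula (separate from the Keras Normalization layer used in the pipeline), the per-channel standardization can be sketched in NumPy on a toy batch:

```python
import numpy as np

# Toy batch of 4 "images", 8x8 pixels, 3 channels, with uint8-like values
rng = np.random.default_rng(42)
X = rng.integers(0, 256, size=(4, 8, 8, 3)).astype(np.float64)

# Per-channel statistics over batch, height and width (axes 0, 1, 2)
mean = X.mean(axis=(0, 1, 2))
std = X.std(axis=(0, 1, 2))

# Broadcasting applies the formula to each channel independently
X_norm = (X - mean) / std

# Each channel is now approximately zero-mean with unit standard deviation
print(X_norm.mean(axis=(0, 1, 2)))  # approximately [0, 0, 0]
print(X_norm.std(axis=(0, 1, 2)))   # approximately [1, 1, 1]
```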
BATCH_SIZE = 128 #@param {type:"number"}
pre_processing_v1 = Normalization()
pre_processing_v1.adapt(X_train) # Calculate the mean and std of the train set
fig, ax = plt.subplots(ncols=2)
ax[0].imshow(X_train[0])
ax[0].set_title('Before Preprocessing')
ax[1].imshow(tf.squeeze(pre_processing_v1(X_train[:1, :, :])))
ax[1].set_title('After Preprocessing')
fig.show()
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
This is what an image looks like after it has been normalized.
Since our goal is to prevent overfitting, we also apply data augmentation. Data augmentation reduces the variance of a model by effectively providing it with more training data; here, random flips and crops create variations of each image.
I picked these simple augmentations because they are cheap to apply, they are the standard augmentations for CIFAR-10, and they preserve the label (a horizontally flipped or slightly shifted cat is still a cat).
Nevertheless, I will try out a stronger data augmentation method later on as part of my experimentation.
def basic_data_aug(images):
image = random_flip_left_right(images)
image = resize_with_crop_or_pad(image, IMG_SIZE[0] + 4, IMG_SIZE[1] + 4)
image = random_crop(
image, size=IMG_SIZE
)
return image
def set_up_data_aug(aug_func=basic_data_aug):
train_ds = tf.data.Dataset.from_tensor_slices((X_train, y_train))
val_ds = tf.data.Dataset.from_tensor_slices((X_val, y_val)).shuffle(BATCH_SIZE * 100).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
train_ds = train_ds.map(
lambda x, y : (aug_func(x), y), num_parallel_calls=tf.data.AUTOTUNE
).shuffle(BATCH_SIZE * 100).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
return train_ds, val_ds
train_aug_ds, val_ds = set_up_data_aug()
train_ds, val_ds = set_up_data_aug(lambda x : x) # apply no data aug
image_batch, label_batch = next(iter(train_aug_ds))
plt.figure(figsize=(10, 10))
for i in range(9):
ax = plt.subplot(3, 3, i + 1)
plt.title(class_labels[np.argmax(label_batch[i])])
plt.imshow(tf.squeeze(image_batch[i]))
plt.axis("off")
Once we've set up a pre-processing pipeline, we will begin training various models.
For all models, we will train using stochastic gradient descent (SGD). I chose SGD because it seems to generalize better than adaptive optimizers like Adam on image classification tasks, and our primary focus is on building a model that generalizes. More specifically, we will use SGD with momentum.
In addition, I will make use of a cosine learning rate schedule with restarts, which comes from the SGDR (SGD with Warm Restarts) paper. A learning rate scheduler decides how the learning rate should be adjusted during training.
This scheduler follows a cosine curve to decay the learning rate, then increases it again after each cycle to simulate a restart of training. This has two advantages: the warm restart lets the model escape bad local minima (the increased learning rate lets the model "jump" out of a poor minimum and find a better one), and the Snapshot Ensembling paper showed that a copy of the model can be saved before each warm restart to cheaply ensemble models together ("train 1, get M for free").
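Concretely, within each cycle the schedule anneals the learning rate as (using the SGDR paper's notation rather than the Keras argument names):

$$ \eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\frac{T_{cur}}{T_i}\pi\right)\right) $$

where $T_{cur}$ is the number of steps since the last restart and $T_i$ is the length of the current cycle; once $T_{cur}$ reaches $T_i$, the learning rate jumps back up to $\eta_{max}$ and a new cycle begins.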
Since each warm restart might cause a temporary degradation in performance, I increase the patience of my Early Stopping to give the model a better chance to recover.
To make the process of model training more standardized, I have created a class which keeps track of the various models tested in this notebook.
#@title Base Hyperparameters
LR = 0.05 #@param {type:"number"}
momentum = 0.9 #@param {type:"number"}
WEIGHT_DECAY = 0.0005 #@param {type:"number"}
MAX_EPOCHS = 200 #@param {type:"integer"}
VAL_SPLIT = 0.2
base_hparams = {
"val_split" : VAL_SPLIT,
"max_epochs" : MAX_EPOCHS,
"batch_size" : BATCH_SIZE
}
from math import ceil
steps_per_epoch = ceil(len(X_train)/ BATCH_SIZE)
lr_scheduler = tf.keras.optimizers.schedules.CosineDecayRestarts(
LR,
steps_per_epoch * 10 # found to work well for CIFAR10 with shorter training time
)
optimizer = SGD(
learning_rate = lr_scheduler,
momentum = momentum
)
I create a simple utility function to plot the loss and accuracy of the model as it trains.
def plot_loss_curve(model_history):
model_history = pd.DataFrame(model_history)
epochs = list(range(1, len(model_history) + 1))
fig = go.Figure()
fig.add_trace(go.Scatter(x=epochs, y=model_history["loss"],
mode='lines+markers',
name='Training Loss'))
fig.add_trace(go.Scatter(x=epochs, y=model_history["val_loss"],
mode='lines+markers',
name='Validation Loss'))
fig.add_trace(go.Scatter(x=epochs, y=model_history["accuracy"],
mode="lines+markers",
name="Training Accuracy"))
fig.add_trace(go.Scatter(x=epochs, y=model_history["val_accuracy"],
mode="lines+markers",
name="Validation Accuracy"))
fig.update_layout(
title="Loss/Acc Plot",
xaxis_title="Epochs",
yaxis_title="Loss/Acc",
)
return fig
To evaluate my models, I created a ModelEvaluator class to keep track of the different experiments conducted and save them offline. By doing this, I can easily record my progress.
class ModelEvaluator:
"""
A class to keep track of experiments, making it easier to record and compare results
"""
def __init__(self, history_path = None, base_savedir = "/content/drive/MyDrive/Data/DELE CA1/CIFAR10"):
if history_path is None:
self.result_history = pd.DataFrame({
"Model Name" : [],
"Epochs" : [],
"Batch Size" : [],
"Train Loss" : [],
"Test Loss" : [],
"Train Acc" : [],
"Test Acc" : [],
"Remarks" : [],
"Model Path" : []
})
else:
self.result_history = pd.read_csv(history_path, sep=";")
self.default_callbacks = [
TerminateOnNaN(),
CSVLogger("/tmp/training.log", append=False)
]
self.base_savedir = base_savedir
def evaluate_model(self, model, training_data = train_ds, validation_data = val_ds, hyperparameters =base_hparams, callbacks = None, plot_loss = True, remarks = "", savedir = None):
"""
Evaluate a model. Assumes the model has already been compiled, so compilation and choice of optimizer must be done beforehand
"""
# Train Model
if callbacks is None:
callbacks = [EarlyStopping(monitor='val_accuracy',patience=20, restore_best_weights=True)]
callbacks = self.default_callbacks + callbacks
name = model.name
validation_split = hyperparameters["val_split"]
epochs = hyperparameters["max_epochs"]
batch_size = hyperparameters["batch_size"]
if savedir is None:
filepath = f"{self.base_savedir}/SavedModels/{name}"
else:
filepath = savedir
print(f"Training {name}")
try:
if validation_data is None:
X_train, y_train = training_data
history = model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, validation_split=validation_split,callbacks=callbacks)
else:
history = model.fit(training_data, validation_data=validation_data, epochs=epochs, batch_size=batch_size,callbacks=callbacks)
history = history.history
# print(history)
except KeyboardInterrupt:
history = pd.read_csv("/tmp/training.log")
# print(history)
print("\nHalting Training")
print(f"Saving best model to {filepath}")
if plot_loss:
try:
fig = plot_loss_curve(history)
except:
print("error creating loss curve")
fig = None
else:
fig = None
result = dict()
result["Epochs"] = len(history["loss"])
result["Batch Size"] = batch_size
result["Model Name"] = name
result["Remarks"] = remarks
result["Model Path"] = filepath
# Calculate Statistics
best_val_idx = np.argmax(history["val_accuracy"])
result["Train Loss"] = history["loss"][best_val_idx]
result["Test Loss"] = history["val_loss"][best_val_idx]
result["Train Acc"] = history["accuracy"][best_val_idx]
result["Test Acc"] = history["val_accuracy"][best_val_idx]
result["[Train - Test] Acc"] = result["Train Acc"] - result["Test Acc"]
self.result_history = self.result_history.append(result, ignore_index=True)
tf.keras.backend.clear_session() # clear all previous models from memory
return pd.Series(result), fig
def return_model(self, model_name):
filepath = self.result_history[
self.result_history["Model Name"] == model_name
]["Model Path"]
assert len(filepath) == 1, "There is no model or more than one model with that name!"
filepath = filepath.values[0]
model = tf.keras.models.load_model(filepath)
return model
def return_history(self, include_cols = ['Model Name', 'Train Acc', 'Test Acc', '[Train - Test] Acc', 'Remarks'] ):
return self.result_history[include_cols]
# def return_training_logs(self, model_name):
# raise NotImplementedError
# def return_loss_plot(self, model_name):
# raise NotImplementedError
# logs = return_training_logs(model_name)
# return plot_loss_curve(logs)
def remove_model(self, model_name):
mask = ~(self.result_history["Model Name"] == model_name)
self.result_history = self.result_history[mask]
def add_remarks(self, model_name, comment):
"""
Add comments to a model result
Comments can include:
- Sources of Error
- Notes about model architecture
"""
mask = (self.result_history["Model Name"] == model_name)
assert mask.sum() == 1, "There is no model or more than one model with that name!"
self.result_history.loc[mask, "Remarks"] = comment
def save_history(self, file_name = None):
if file_name is None:
file_name = f"{self.base_savedir}/history.csv"
self.result_history.to_csv(file_name, sep=";", index=False)
print(f"History saved to {file_name}")
evaluator = ModelEvaluator("/content/drive/MyDrive/Data/DELE CA1/CIFAR10/history.csv")
# evaluator = ModelEvaluator()
evaluator.return_history()
| | Model Name | Train Acc | Test Acc | [Train - Test] Acc | Remarks |
|---|---|---|---|---|---|
| 0 | Baseline_MLP | 0.631750 | 0.5247 | 0.107050 | High Variance |
| 1 | Baseline_CNN_1 | 0.999825 | 0.8665 | 0.133325 | Low Avoidable Bias but Overfits |
| 2 | Baseline_CNN_1_DataAug | 0.996625 | 0.8826 | 0.114025 | Data Aug lowers variance, but still overfits |
| 3 | efficientnetv2-s | 0.320450 | 0.3495 | -0.029050 | NaN |
| 4 | ImprovedWideResNet_28_10_No_Stochastic_Depth_B... | 0.997800 | 0.8739 | 0.123900 | NaN |
| 5 | ImprovedWideResNet_28_10_No_Stochastic_Depth_B... | 0.999425 | 0.9398 | 0.059625 | NaN |
| 6 | ImprovedWideResNet_28_10_ProperDropout_No_Stoc... | 0.998900 | 0.9456 | 0.053300 | Best model thus far |
| 7 | ImprovedWideResNet_28_10_Dropout_No_Stochastic... | 0.999850 | 0.9170 | 0.082850 | NaN |
| 8 | WideResNet_28_10_Fixed_BasicDataAug | 0.998625 | 0.9333 | 0.065325 | NaN |
| 9 | WideResNet_28_10_Fixed_CutMix | 0.916025 | 0.9391 | -0.023075 | NaN |
| 10 | SEWRN_28_10_Fixed_Cutmix | 0.929250 | 0.9461 | -0.016850 | NaN |
| 11 | SEWRN_28_10_Fixed_Cutmix_Mish | 0.639800 | 0.7739 | -0.134100 | NaN |
| 12 | SEWRN_28_10_SWA_Cutmix | 0.196825 | 0.2231 | -0.026275 | NaN |
assert 1 == 2, "Stop Execution Beyond This Point"
--------------------------------------------------------------------------- AssertionError Traceback (most recent call last) <ipython-input-92-21ab34847f1a> in <module>() ----> 1 assert 1 == 2, "Stop Execution Beyond This Point" AssertionError: Stop Execution Beyond This Point
As a simple baseline, I build a fully connected neural network.
def build_mlp_network(optimizer):
inputs = Input(IMG_SIZE) # Input
x = pre_processing_v1(inputs)
x = Flatten()(x)
x = Dense(128, 'relu')(x) # Hidden Layer
x = Dense(128, 'relu')(x)
x = Dense(128, 'relu')(x)
x = Dense(10, 'softmax')(x)
model = Model(inputs=inputs, outputs=x, name='Baseline_MLP')
model.compile(optimizer=optimizer,loss='categorical_crossentropy', metrics=['accuracy'])
print(model.summary())
return model
lr_scheduler = tf.keras.optimizers.schedules.CosineDecayRestarts(
LR,
steps_per_epoch * 10 # found to work well for CIFAR10 with shorter training time
)
optimizer = SGD(
learning_rate = lr_scheduler,
momentum = momentum
)
model = build_mlp_network(optimizer)
Model: "Baseline_MLP" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_2 (InputLayer) [(None, 32, 32, 3)] 0 _________________________________________________________________ normalization_1 (Normalizati (None, 32, 32, 3) 7 _________________________________________________________________ flatten_1 (Flatten) (None, 3072) 0 _________________________________________________________________ dense_4 (Dense) (None, 128) 393344 _________________________________________________________________ dense_5 (Dense) (None, 128) 16512 _________________________________________________________________ dense_6 (Dense) (None, 128) 16512 _________________________________________________________________ dense_7 (Dense) (None, 10) 1290 ================================================================= Total params: 427,665 Trainable params: 427,658 Non-trainable params: 7 _________________________________________________________________ None
results, fig = evaluator.evaluate_model(model)
Training Baseline_MLP Epoch 1/200 313/313 [==============================] - 2s 4ms/step - loss: 1.7810 - accuracy: 0.3686 - val_loss: 1.6387 - val_accuracy: 0.4209 INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_MLP/assets Epoch 2/200 313/313 [==============================] - 1s 4ms/step - loss: 1.5824 - accuracy: 0.4370 - val_loss: 1.5925 - val_accuracy: 0.4474 INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_MLP/assets Epoch 3/200 313/313 [==============================] - 1s 4ms/step - loss: 1.4731 - accuracy: 0.4768 - val_loss: 1.5149 - val_accuracy: 0.4676 INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_MLP/assets Epoch 4/200 313/313 [==============================] - 1s 4ms/step - loss: 1.3658 - accuracy: 0.5118 - val_loss: 1.4915 - val_accuracy: 0.4809 INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_MLP/assets Epoch 5/200 313/313 [==============================] - 1s 4ms/step - loss: 1.2487 - accuracy: 0.5534 - val_loss: 1.4223 - val_accuracy: 0.5034 INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_MLP/assets Epoch 6/200 313/313 [==============================] - 1s 4ms/step - loss: 1.1295 - accuracy: 0.5972 - val_loss: 1.4233 - val_accuracy: 0.5193 INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_MLP/assets Epoch 7/200 313/313 [==============================] - 1s 4ms/step - loss: 1.0259 - accuracy: 0.6317 - val_loss: 1.4227 - val_accuracy: 0.5247 INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_MLP/assets Epoch 8/200 313/313 [==============================] - 1s 4ms/step - loss: 0.9660 - accuracy: 0.6561 - val_loss: 1.4760 - val_accuracy: 0.5167 Epoch 9/200 313/313 
[==============================] - 1s 4ms/step - loss: 1.4212 - accuracy: 0.4974 - val_loss: 1.6058 - val_accuracy: 0.4591 Epoch 10/200 313/313 [==============================] - 1s 4ms/step - loss: 1.3722 - accuracy: 0.5135 - val_loss: 1.6096 - val_accuracy: 0.4570 Epoch 11/200 313/313 [==============================] - 1s 4ms/step - loss: 1.3302 - accuracy: 0.5334 - val_loss: 1.5749 - val_accuracy: 0.4721 Epoch 12/200 313/313 [==============================] - 1s 4ms/step - loss: 1.2905 - accuracy: 0.5460 - val_loss: 1.5626 - val_accuracy: 0.4740 Epoch 13/200 313/313 [==============================] - 1s 4ms/step - loss: 1.2215 - accuracy: 0.5648 - val_loss: 1.5409 - val_accuracy: 0.4932 Epoch 14/200 313/313 [==============================] - 1s 4ms/step - loss: 1.1479 - accuracy: 0.5914 - val_loss: 1.5839 - val_accuracy: 0.4913 Epoch 15/200 313/313 [==============================] - 1s 4ms/step - loss: 1.0856 - accuracy: 0.6109 - val_loss: 1.5513 - val_accuracy: 0.5003 Epoch 16/200 313/313 [==============================] - 1s 4ms/step - loss: 1.0063 - accuracy: 0.6432 - val_loss: 1.5528 - val_accuracy: 0.4969 Epoch 17/200 313/313 [==============================] - 1s 4ms/step - loss: 0.9151 - accuracy: 0.6712 - val_loss: 1.5783 - val_accuracy: 0.5061 Epoch 18/200 313/313 [==============================] - 1s 4ms/step - loss: 0.8195 - accuracy: 0.7049 - val_loss: 1.6308 - val_accuracy: 0.5161 Epoch 19/200 313/313 [==============================] - 1s 4ms/step - loss: 0.7336 - accuracy: 0.7358 - val_loss: 1.6849 - val_accuracy: 0.5151 Epoch 20/200 313/313 [==============================] - 1s 4ms/step - loss: 0.6494 - accuracy: 0.7646 - val_loss: 1.7745 - val_accuracy: 0.5189 Epoch 21/200 313/313 [==============================] - 1s 4ms/step - loss: 0.5786 - accuracy: 0.7929 - val_loss: 1.8157 - val_accuracy: 0.5190 Epoch 22/200 313/313 [==============================] - 1s 4ms/step - loss: 0.5257 - accuracy: 0.8139 - val_loss: 1.8669 - val_accuracy: 0.5150 
Epoch 23/200 313/313 [==============================] - 1s 4ms/step - loss: 0.4920 - accuracy: 0.8271 - val_loss: 1.8813 - val_accuracy: 0.5182 Epoch 24/200 313/313 [==============================] - 1s 4ms/step - loss: 0.4786 - accuracy: 0.8341 - val_loss: 2.0694 - val_accuracy: 0.4917 Epoch 25/200 313/313 [==============================] - 1s 4ms/step - loss: 1.2771 - accuracy: 0.5621 - val_loss: 1.6602 - val_accuracy: 0.4724 Epoch 26/200 313/313 [==============================] - 1s 4ms/step - loss: 1.2242 - accuracy: 0.5742 - val_loss: 1.6207 - val_accuracy: 0.4690 Epoch 27/200 313/313 [==============================] - 1s 4ms/step - loss: 1.2011 - accuracy: 0.5831 - val_loss: 1.7225 - val_accuracy: 0.4639 Saving best model to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_MLP
display(results)
fig.show()
Epochs 27 Batch Size 128 Model Name Baseline_MLP Remarks Model Path /content/drive/MyDrive/Data/DELE CA1/CIFAR10/S... Train Loss 1.02592 Test Loss 1.42272 Train Acc 0.63175 Test Acc 0.5247 [Train - Test] Acc 0.10705 dtype: object
We can see that the overall performance of the model is poor: both the training and validation accuracy are low, and there is a sizeable gap between them (high variance).
evaluator.add_remarks("Baseline_MLP", "High Variance")
To do better, I move towards a CNN architecture, as CNNs are well suited to the problem of image classification.
To begin with, we will construct a simple CNN, loosely based on VGGNet.
Instead of taking the VGG architecture wholesale, I downsize it and make a few changes: Batch Normalization is applied before each activation, Global Average Pooling replaces the large fully connected layers, and each block uses fewer filters.
The end result is a much smaller network than the original VGG networks.
My implementation is forked from the one given in Dive into Deep Learning.
class BasicConvLayer(Model):
"""
Basic Conv Layer, with Batch Normalization done before activation
"""
def __init__(self, filters , activation = ReLU):
super(BasicConvLayer, self).__init__() # initialize the parent Model class
self.layer_conv_1 = Conv2D(filters, (3, 3), padding='same', strides=1, kernel_regularizer=l2(WEIGHT_DECAY))
self.layer_bn_1 = BatchNormalization()
self.layer_activation_1 = activation()
def call(self, X):
X = self.layer_conv_1(X)
X = self.layer_bn_1(X)
return self.layer_activation_1(X)
class BasicConvBlock(Layer):
def __init__(self, no_layers, filters, activation=ReLU):
super(BasicConvBlock, self).__init__()
self.conv_block = Sequential()
for _ in range(no_layers):
self.conv_block.add(
BasicConvLayer(filters, activation=activation)
)
self.conv_block.add(MaxPool2D(strides=2))
def call(self, X):
return self.conv_block(X)
def build_baseline_cnn(optimizer, loss='categorical_crossentropy', name='Baseline_CNN_1'):
"""
A modified and cut down version of VGG16, using GlobalPooling, and a modified VGGBlock
"""
inputs = Input(IMG_SIZE) # Input
x = pre_processing_v1(inputs)
x = BasicConvBlock(2, 32)(x)
x = BasicConvBlock(2, 64)(x)
x = BasicConvBlock(3, 128)(x)
x = BasicConvBlock(3, 256)(x)
# Global Pooling
x = GlobalAveragePooling2D()(x)
# Classification Head
x = Dense(128, activation='relu', kernel_regularizer=l2(WEIGHT_DECAY))(x)
x = Dropout(0.3)(x)
x = Dense(10, 'softmax', kernel_regularizer=l2(WEIGHT_DECAY))(x)
model = Model(inputs=inputs, outputs=x, name=name)
model.compile(optimizer=optimizer,loss=loss, metrics=['accuracy'])
print(model.summary())
return model
lr_scheduler = tf.keras.optimizers.schedules.CosineDecayRestarts(
LR,
steps_per_epoch * 10 # found to work well for CIFAR10 with shorter training time
)
optimizer = SGD(
learning_rate = lr_scheduler,
momentum = momentum
)
model = build_baseline_cnn(optimizer)
Model: "Baseline_CNN_1" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_1 (InputLayer) [(None, 32, 32, 3)] 0 _________________________________________________________________ normalization_1 (Normalizati (None, 32, 32, 3) 7 _________________________________________________________________ basic_conv_block (BasicConvB (None, 16, 16, 32) 10400 _________________________________________________________________ basic_conv_block_1 (BasicCon (None, 8, 8, 64) 55936 _________________________________________________________________ basic_conv_block_2 (BasicCon (None, 4, 4, 128) 370560 _________________________________________________________________ basic_conv_block_3 (BasicCon (None, 2, 2, 256) 1478400 _________________________________________________________________ global_average_pooling2d (Gl (None, 256) 0 _________________________________________________________________ dense (Dense) (None, 128) 32896 _________________________________________________________________ dropout (Dropout) (None, 128) 0 _________________________________________________________________ dense_1 (Dense) (None, 10) 1290 ================================================================= Total params: 1,949,489 Trainable params: 1,946,794 Non-trainable params: 2,695 _________________________________________________________________ None
results, fig = evaluator.evaluate_model(model)
Training Baseline_CNN_1
Epoch 1/200
313/313 [==============================] - 20s 13ms/step - loss: 2.2388 - accuracy: 0.4223 - val_loss: 2.0846 - val_accuracy: 0.4518
INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_CNN_1/assets
Epoch 2/200
313/313 [==============================] - 4s 11ms/step - loss: 1.6751 - accuracy: 0.6023 - val_loss: 2.4074 - val_accuracy: 0.4286
Epoch 3/200
313/313 [==============================] - 3s 11ms/step - loss: 1.3826 - accuracy: 0.6946 - val_loss: 1.3776 - val_accuracy: 0.6716
...
Epoch 27/200
313/313 [==============================] - 4s 11ms/step - loss: 0.2444 - accuracy: 0.9998 - val_loss: 0.9245 - val_accuracy: 0.8665
INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_CNN_1/assets
...
Epoch 47/200
313/313 [==============================] - 4s 11ms/step - loss: 0.7142 - accuracy: 0.9317 - val_loss: 1.5385 - val_accuracy: 0.7337
Saving best model to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_CNN_1
display(results)
fig.show()
Epochs                                                             47
Batch Size                                                        128
Model Name                                             Baseline_CNN_1
Remarks
Model Path          /content/drive/MyDrive/Data/DELE CA1/CIFAR10/S...
Train Loss                                                    0.24442
Test Loss                                                    0.924474
Train Acc                                                    0.999825
Test Acc                                                       0.8665
[Train - Test] Acc                                           0.133325
dtype: object
By observing the learning curves, we can see that this baseline CNN overfits the training data. As such, our attention now turns towards reducing the model's variance.
evaluator.add_remarks("Baseline_CNN_1", "Low Avoidable Bias but Overfits")
evaluator.return_history()
|   | Model Name | Train Acc | Test Acc | [Train - Test] Acc | Remarks |
|---|---|---|---|---|---|
| 0 | Baseline_MLP | 0.631750 | 0.5247 | 0.107050 | High Variance |
| 1 | Baseline_CNN_1 | 0.999825 | 0.8665 | 0.133325 | Low Avoidable Bias but Overfits |
evaluator.save_history()
History saved to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/history.csv
evaluator.return_history()
|   | Model Name | Train Acc | Test Acc | [Train - Test] Acc | Remarks |
|---|---|---|---|---|---|
| 0 | Baseline_MLP | 0.631750 | 0.5247 | 0.107050 | High Variance |
| 1 | Baseline_CNN_1 | 0.999825 | 0.8665 | 0.133325 | Low Avoidable Bias but Overfits |
lr_scheduler = tf.keras.optimizers.schedules.CosineDecayRestarts(
    LR,
    steps_per_epoch * 10  # found to work well for CIFAR-10 with a shorter training time
)
optimizer = SGD(
    learning_rate=lr_scheduler,
    momentum=momentum
)
baseline_data_aug = build_baseline_cnn(optimizer, name="Baseline_CNN_1_DataAug")
Model: "Baseline_CNN_1_DataAug"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 32, 32, 3)]       0
_________________________________________________________________
normalization_1 (Normalizati (None, 32, 32, 3)         7
_________________________________________________________________
basic_conv_block (BasicConvB (None, 16, 16, 32)        10400
_________________________________________________________________
basic_conv_block_1 (BasicCon (None, 8, 8, 64)          55936
_________________________________________________________________
basic_conv_block_2 (BasicCon (None, 4, 4, 128)         370560
_________________________________________________________________
basic_conv_block_3 (BasicCon (None, 2, 2, 256)         1478400
_________________________________________________________________
global_average_pooling2d (Gl (None, 256)               0
_________________________________________________________________
dense (Dense)                (None, 128)               32896
_________________________________________________________________
dropout (Dropout)            (None, 128)               0
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290
=================================================================
Total params: 1,949,489
Trainable params: 1,946,794
Non-trainable params: 2,695
_________________________________________________________________
None
train_ds, val_ds = set_up_data_aug()
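`set_up_data_aug` is a helper defined earlier in the notebook. As a hedged sketch of the standard CIFAR-10 augmentation it represents (pad, random crop, random horizontal flip), the per-image transform can be written in plain NumPy — the function name and defaults here are illustrative, not the notebook's actual implementation:

```python
import numpy as np

def augment_image(img, pad=4, rng=None):
    """Standard CIFAR-10 augmentation sketch: reflect-pad the image,
    take a random crop at the original size, and flip horizontally
    with probability 0.5. `img` is an HxWxC array."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = img.shape[:2]
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    crop = padded[top:top + h, left:left + w]
    if rng.random() < 0.5:
        crop = crop[:, ::-1]  # horizontal flip
    return crop
```

Because each epoch then sees slightly different images, the network cannot memorise individual training examples as easily, which is exactly the variance reduction we are after.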
results, fig = evaluator.evaluate_model(baseline_data_aug, training_data=train_ds, validation_data=val_ds)
Training Baseline_CNN_1_DataAug
Epoch 1/200
313/313 [==============================] - 7s 15ms/step - loss: 2.2975 - accuracy: 0.3930 - val_loss: 2.0437 - val_accuracy: 0.4737
INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_CNN_1_DataAug/assets
Epoch 2/200
313/313 [==============================] - 5s 15ms/step - loss: 1.7573 - accuracy: 0.5741 - val_loss: 2.2887 - val_accuracy: 0.4701
Epoch 3/200
313/313 [==============================] - 5s 15ms/step - loss: 1.4587 - accuracy: 0.6651 - val_loss: 2.4402 - val_accuracy: 0.4929
...
Epoch 42/200
313/313 [==============================] - 5s 14ms/step - loss: 0.2254 - accuracy: 0.9966 - val_loss: 0.6978 - val_accuracy: 0.8826
INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_CNN_1_DataAug/assets
...
Epoch 62/200
313/313 [==============================] - 5s 14ms/step - loss: 0.8028 - accuracy: 0.8947 - val_loss: 1.6637 - val_accuracy: 0.6855
Saving best model to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/Baseline_CNN_1_DataAug
display(results)
fig.show()
Epochs                                                             62
Batch Size                                                        128
Model Name                                     Baseline_CNN_1_DataAug
Remarks
Model Path          /content/drive/MyDrive/Data/DELE CA1/CIFAR10/S...
Train Loss                                                   0.225433
Test Loss                                                    0.697799
Train Acc                                                    0.996625
Test Acc                                                       0.8826
[Train - Test] Acc                                           0.114025
dtype: object
evaluator.add_remarks("Baseline_CNN_1_DataAug", "Data Aug lowers variance, but still overfits")
evaluator.return_history()
|   | Model Name | Train Acc | Test Acc | [Train - Test] Acc | Remarks |
|---|---|---|---|---|---|
| 0 | Baseline_MLP | 0.631750 | 0.5247 | 0.107050 | High Variance |
| 1 | Baseline_CNN_1 | 0.999825 | 0.8665 | 0.133325 | Low Avoidable Bias but Overfits |
| 2 | Baseline_CNN_1_DataAug | 0.996625 | 0.8826 | 0.114025 | Data Aug lowers variance, but still overfits |
evaluator.save_history()
History saved to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/history.csv
To improve upon our baseline, I make use of the Wide ResNet architecture. In summary, Wide ResNet is a residual network that trades depth for width: the number of filters in every convolutional layer is multiplied by a widening factor k. Although wider layers would be expected to overfit more, the overfitting can be counteracted by weight decay, dropout and data augmentation. The benefits of going wide are that wide convolutions parallelise much better on GPUs, making training considerably faster, and that a comparatively shallow wide network can match or exceed the accuracy of far deeper thin ResNets.
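Because the widening factor k multiplies both the input and output channel counts of each residual convolution, the parameter count of those layers grows roughly quadratically in k. A quick sanity check (the channel figures are illustrative, not taken from the notebook):

```python
def conv_params(in_ch, out_ch, kernel=3):
    """Parameter count of a bias-free kernel x kernel convolution."""
    return in_ch * out_ch * kernel * kernel

# Widening a 64 -> 64 convolution by k = 10 multiplies both channel
# counts, so its parameter count grows by a factor of k**2 = 100.
base = conv_params(64, 64)
wide = conv_params(64 * 10, 64 * 10)
assert wide == base * 100
```

This quadratic growth is why WideResNet-28-10 below ends up with ~36M parameters despite being only 28 layers deep.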
evaluator.return_history()
|   | Model Name | Train Acc | Test Acc | [Train - Test] Acc | Remarks |
|---|---|---|---|---|---|
| 0 | Baseline_MLP | 0.631750 | 0.5247 | 0.107050 | High Variance |
| 1 | Baseline_CNN_1 | 0.999825 | 0.8665 | 0.133325 | Low Avoidable Bias but Overfits |
| 2 | Baseline_CNN_1_DataAug | 0.996625 | 0.8826 | 0.114025 | Data Aug lowers variance, but still overfits |
| 3 | efficientnetv2-s | 0.320450 | 0.3495 | -0.029050 | NaN |
evaluator.return_history()
|   | Model Name | Train Acc | Test Acc | [Train - Test] Acc | Remarks |
|---|---|---|---|---|---|
| 0 | Baseline_MLP | 0.631750 | 0.5247 | 0.107050 | High Variance |
| 1 | Baseline_CNN_1 | 0.999825 | 0.8665 | 0.133325 | Low Avoidable Bias but Overfits |
| 2 | Baseline_CNN_1_DataAug | 0.996625 | 0.8826 | 0.114025 | Data Aug lowers variance, but still overfits |
| 3 | efficientnetv2-s | 0.320450 | 0.3495 | -0.029050 | NaN |
| 7 | ImprovedWideResNet_28_10_No_Stochastic_Depth_B... | 0.997800 | 0.8739 | 0.123900 | NaN |
| 8 | ImprovedWideResNet_28_10_No_Stochastic_Depth_B... | 0.999425 | 0.9398 | 0.059625 | NaN |
| 9 | ImprovedWideResNet_28_10_ProperDropout_No_Stoc... | 0.998900 | 0.9456 | 0.053300 | Best model thus far |
| 10 | ImprovedWideResNet_28_10_Dropout_No_Stochastic... | 0.999850 | 0.9170 | 0.082850 | NaN |
| 11 | WideResNet_28_10_Fixed_BasicDataAug | 0.998625 | 0.9333 | 0.065325 | NaN |
| 12 | WideResNet_28_10_Fixed_CutMix | 0.916025 | 0.9391 | -0.023075 | NaN |
# Code adapted for Keras from https://github.com/szagoruyko/wide-residual-networks/blob/master/models/wide-resnet.lua
from tensorflow.keras.regularizers import l2

class WideResNetLayer(Model):
    """
    Wide ResNet residual layer.
    - The B(3,3) block (two 3x3 convolutions) was found to perform best
    - Dropout is added between the convolutions, after the activation
    - Pre-activation ordering is used: BN-ReLU-Conv
    """
    def __init__(self, num_channels, k, use_1x1conv=False, strides=1, activation='relu', dropout=0.3):
        super(WideResNetLayer, self).__init__()
        self.activation = Activation(activation)
        self.bn_1 = BatchNormalization(epsilon=1e-5, gamma_initializer='uniform')
        self.conv_1 = Conv2D(num_channels * k, 3, strides=strides, padding="same",
                             kernel_regularizer=l2(WEIGHT_DECAY), use_bias=False, kernel_initializer='he_normal')
        self.dropout = Dropout(dropout)  # 0.3 was found to be optimal for CIFAR-10
        self.bn_2 = BatchNormalization(epsilon=1e-5, gamma_initializer='uniform')
        self.conv_2 = Conv2D(num_channels * k, 3, strides=1, padding="same",
                             kernel_regularizer=l2(WEIGHT_DECAY), use_bias=False, kernel_initializer='he_normal')
        if use_1x1conv:
            # 1x1 convolution projects the shortcut to the residual branch's shape
            self.conv_3 = Conv2D(num_channels * k, 1, strides=strides, padding="same",
                                 kernel_regularizer=l2(WEIGHT_DECAY), use_bias=False, kernel_initializer='he_normal')
        else:
            self.conv_3 = None

    def call(self, X):
        Y = self.bn_1(X)
        Y = self.activation(Y)
        Y = self.conv_1(Y)
        Y = self.bn_2(Y)
        Y = self.activation(Y)
        Y = self.dropout(Y)
        Y = self.conv_2(Y)
        if self.conv_3 is not None:
            X = self.conv_3(X)  # project the shortcut connection
        return Y + X
class WideResnetBlock(Layer):
    def __init__(self, num_channels, k, num_residuals, first_block=False, layer=WideResNetLayer,
                 activation='relu', dropout=0.3, **kwargs):
        super(WideResnetBlock, self).__init__(**kwargs)
        self.residual_layers = []
        for i in range(num_residuals):
            if first_block:
                # the first stage keeps the input resolution; the 1x1 conv widens the shortcut
                self.residual_layers.append(
                    layer(num_channels, k, use_1x1conv=True, strides=1, activation=activation, dropout=dropout))
            elif i == 0:
                # the first layer of each later stage halves the resolution
                self.residual_layers.append(
                    layer(num_channels, k, use_1x1conv=True, strides=2, activation=activation, dropout=dropout))
            else:
                self.residual_layers.append(
                    layer(num_channels, k, strides=1, activation=activation, dropout=dropout))

    def call(self, X):
        for layer in self.residual_layers:  # residual_layers is a plain list, so iterate it directly
            X = layer(X)
        return X
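Since only the first residual layer of each later stage uses stride 2, a 32x32 input passes through the three stages at 32x32, 16x16 and 8x8. The schedule can be traced without TensorFlow; the helper name here is illustrative:

```python
def trace_resolution(input_size=32, stage_strides=(1, 2, 2)):
    """Follow the spatial resolution through the three WideResnetBlock
    stages; only the first layer of a stage ever downsamples."""
    sizes = []
    size = input_size
    for stride in stage_strides:
        size //= stride
        sizes.append(size)
    return sizes

# 32x32 input -> stages at 32, 16 and 8, matching the Output Shape
# column of the model summary printed below.
```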
def build_wideresnet(optimizer, loss='categorical_crossentropy', name="WideResNet_28_10",
                     layer=WideResNetLayer, activation='relu'):
    """
    WideResNet-28-10
    - Depth of 28: three stages of four residual layers (two 3x3 convolutions each),
      plus the stem and projection/classifier layers
    - Widening factor k = 10: every stage's filter count is multiplied by 10
    """
    k = 10
    inputs = Input(IMG_SIZE)  # input
    x = pre_processing_v1(inputs)  # normalization
    x = Conv2D(16, 3, padding="same", kernel_regularizer=l2(WEIGHT_DECAY), use_bias=False,
               kernel_initializer='he_normal')(x)
    x = WideResnetBlock(16, k, 4, first_block=True, activation=activation, layer=layer)(x)
    x = WideResnetBlock(32, k, 4, activation=activation, layer=layer)(x)
    x = WideResnetBlock(64, k, 4, activation=activation, layer=layer)(x)
    x = BatchNormalization(epsilon=1e-5, gamma_initializer='uniform')(x)
    x = Activation(activation)(x)
    # global pooling
    x = GlobalAveragePooling2D()(x)
    # classification head
    x = Dense(10, 'softmax', kernel_regularizer=l2(WEIGHT_DECAY), kernel_initializer='he_normal')(x)
    model = Model(inputs=inputs, outputs=x, name=name)
    model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])
    print(model.summary())
    return model
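The "28" in WideResNet-28-10 follows the paper's d = 6n + 4 convention, with n residual layers per stage. A small arithmetic check (the helper is illustrative):

```python
def wrn_depth(n_per_stage, stages=3, convs_per_layer=2, extra_layers=4):
    """WideResNet depth convention d = 6n + 4: each of the three stages
    stacks n residual layers of two 3x3 convolutions (6n layers total),
    plus 4 further layers (stem convolution and, depending on how you
    count, projections/classifier)."""
    return stages * n_per_stage * convs_per_layer + extra_layers

# n = 4 per stage, as in build_wideresnet above, gives depth 28.
```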
lr_scheduler = tf.keras.optimizers.schedules.CosineDecayRestarts(
    LR,
    steps_per_epoch * 10  # found to work well for CIFAR-10 with a shorter training time
)
optimizer = SGD(
    learning_rate=lr_scheduler,
    momentum=momentum
)
callbacks = [
    TerminateOnNaN(),
    CSVLogger("/tmp/training.log", append=False)
]
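`CSVLogger` writes one row per epoch, with `epoch` plus each compiled metric as columns, so the `/tmp/training.log` file can be inspected after training with the standard `csv` module. A sketch — the sample rows mirror the first two WideResNet epochs below, and the helper name is illustrative:

```python
import csv
import io

# CSVLogger output: a header row, then one row per epoch.
sample_log = """epoch,accuracy,loss,val_accuracy,val_loss
0,0.2194,11.5703,0.2364,10.4587
1,0.2831,9.5799,0.2841,8.9163
"""

def best_epoch(log_text, metric="val_accuracy"):
    """Return the epoch number with the highest value of `metric`."""
    rows = list(csv.DictReader(io.StringIO(log_text)))
    best = max(rows, key=lambda row: float(row[metric]))
    return int(best["epoch"])
```

In practice the same file would be read with `open("/tmp/training.log")` instead of the inline sample.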
train_ds, val_ds = set_up_data_aug()
model = build_wideresnet(optimizer, name="WideResNet_28_10_Fixed_BasicDataAug")
results, fig = evaluator.evaluate_model(model, training_data=train_ds, validation_data=val_ds,callbacks=callbacks)
evaluator.save_history()
Model: "WideResNet_28_10_Fixed_BasicDataAug"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         [(None, 32, 32, 3)]       0
normalization (Normalization) (None, 32, 32, 3)        7
conv2d_31 (Conv2D)           (None, 32, 32, 16)        432
wide_resnet_block_3 (WideResnetBlock) (None, 32, 32, 160)  1719744
wide_resnet_block_4 (WideResnetBlock) (None, 16, 16, 320)  6972800
wide_resnet_block_5 (WideResnetBlock) (None, 8, 8, 640)    27872000
batch_normalization_49 (BatchNormalization) (None, 8, 8, 640)  2560
re_lu_3 (ReLU)               (None, 8, 8, 640)         0
global_average_pooling2d_1 (GlobalAveragePooling2D) (None, 640)  0
dense (Dense)                (None, 10)                6410
=================================================================
Total params: 36,573,953
Trainable params: 36,555,994
Non-trainable params: 17,959
_________________________________________________________________
None
Training WideResNet_28_10_Fixed_BasicDataAug
Epoch 1/200
313/313 [==============================] - ETA: 0s - loss: 11.5703 - accuracy: 0.2194INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 92s 280ms/step - loss: 11.5703 - accuracy: 0.2194 - val_loss: 10.4587 - val_accuracy: 0.2364
Epoch 2/200
313/313 [==============================] - ETA: 0s - loss: 9.5799 - accuracy: 0.2831INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 86s 272ms/step - loss: 9.5799 - accuracy: 0.2831 - val_loss: 8.9163 - val_accuracy: 0.2841
Epoch 3/200
313/313 [==============================] - ETA: 0s - loss: 8.3078 - accuracy: 0.3469INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 87s 277ms/step - loss: 8.3078 - accuracy: 0.3469 - val_loss: 8.1845 - val_accuracy: 0.2851
Epoch 4/200
313/313 [==============================] - ETA: 0s - loss: 7.6155 - accuracy: 0.3976INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 87s 275ms/step - loss: 7.6155 - accuracy: 0.3976 - val_loss: 7.4581 - val_accuracy: 0.3873
Epoch 5/200
313/313 [==============================] - ETA: 0s - loss: 7.2732 - accuracy: 0.4408INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 89s 281ms/step - loss: 7.2732 - accuracy: 0.4408 - val_loss: 7.2088 - val_accuracy: 0.4463
Epoch 6/200
313/313 [==============================] - 65s 206ms/step - loss: 7.1838 - accuracy: 0.4441 - val_loss: 9.5594 - val_accuracy: 0.2337
Epoch 7/200
313/313 [==============================] - 65s 205ms/step - loss: 6.1972 - accuracy: 0.4526 - val_loss: 5.6923 - val_accuracy: 0.4035
Epoch 8/200
313/313 [==============================] - ETA: 0s - loss: 4.6847 - accuracy: 0.5691INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 86s 273ms/step - loss: 4.6847 - accuracy: 0.5691 - val_loss: 4.7705 - val_accuracy: 0.4642
Epoch 9/200
313/313 [==============================] - 65s 206ms/step - loss: 3.6372 - accuracy: 0.6362 - val_loss: 6.8380 - val_accuracy: 0.3063
Epoch 10/200
313/313 [==============================] - ETA: 0s - loss: 2.9137 - accuracy: 0.6798INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 87s 277ms/step - loss: 2.9137 - accuracy: 0.6798 - val_loss: 3.2522 - val_accuracy: 0.5105
Epoch 11/200
313/313 [==============================] - ETA: 0s - loss: 2.3742 - accuracy: 0.7229INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 87s 276ms/step - loss: 2.3742 - accuracy: 0.7229 - val_loss: 2.6674 - val_accuracy: 0.5878
Epoch 12/200
313/313 [==============================] - ETA: 0s - loss: 1.9912 - accuracy: 0.7592INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 87s 275ms/step - loss: 1.9912 - accuracy: 0.7592 - val_loss: 2.0590 - val_accuracy: 0.7072
Epoch 13/200
313/313 [==============================] - ETA: 0s - loss: 1.6915 - accuracy: 0.7968INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 87s 277ms/step - loss: 1.6915 - accuracy: 0.7968 - val_loss: 1.8403 - val_accuracy: 0.7288
Epoch 14/200
313/313 [==============================] - ETA: 0s - loss: 1.4592 - accuracy: 0.8277INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 86s 273ms/step - loss: 1.4592 - accuracy: 0.8277 - val_loss: 1.5068 - val_accuracy: 0.7950
Epoch 15/200
313/313 [==============================] - 65s 205ms/step - loss: 1.2814 - accuracy: 0.8556 - val_loss: 1.4767 - val_accuracy: 0.7856
Epoch 16/200
313/313 [==============================] - ETA: 0s - loss: 1.1363 - accuracy: 0.8810INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 86s 272ms/step - loss: 1.1363 - accuracy: 0.8810 - val_loss: 1.3229 - val_accuracy: 0.8160
Epoch 17/200
313/313 [==============================] - ETA: 0s - loss: 1.0094 - accuracy: 0.9090INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 86s 272ms/step - loss: 1.0094 - accuracy: 0.9090 - val_loss: 1.1856 - val_accuracy: 0.8482
Epoch 18/200
313/313 [==============================] - ETA: 0s - loss: 0.8991 - accuracy: 0.9351INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 86s 271ms/step - loss: 0.8991 - accuracy: 0.9351 - val_loss: 1.0476 - val_accuracy: 0.8814
Epoch 19/200
313/313 [==============================] - ETA: 0s - loss: 0.8191 - accuracy: 0.9551INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 86s 273ms/step - loss: 0.8191 - accuracy: 0.9551 - val_loss: 0.9961 - val_accuracy: 0.8964
Epoch 20/200
313/313 [==============================] - ETA: 0s - loss: 0.7572 - accuracy: 0.9746INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 87s 276ms/step - loss: 0.7572 - accuracy: 0.9746 - val_loss: 0.9682 - val_accuracy: 0.9059
Epoch 21/200
313/313 [==============================] - 65s 206ms/step - loss: 0.7256 - accuracy: 0.9835 - val_loss: 0.9598 - val_accuracy: 0.9054
Epoch 22/200
313/313 [==============================] - 65s 206ms/step - loss: 0.8427 - accuracy: 0.9435 - val_loss: 22.2469 - val_accuracy: 0.2381
Epoch 23/200
313/313 [==============================] - 65s 205ms/step - loss: 1.2463 - accuracy: 0.7984 - val_loss: 1.9935 - val_accuracy: 0.5859
Epoch 24/200
313/313 [==============================] - 65s 205ms/step - loss: 1.0935 - accuracy: 0.8312 - val_loss: 1.5098 - val_accuracy: 0.7050
Epoch 25/200
313/313 [==============================] - 65s 206ms/step - loss: 0.9958 - accuracy: 0.8471 - val_loss: 1.4628 - val_accuracy: 0.7153
Epoch 26/200
313/313 [==============================] - 65s 205ms/step - loss: 0.9361 - accuracy: 0.8554 - val_loss: 1.2006 - val_accuracy: 0.7619
Epoch 27/200
313/313 [==============================] - 65s 205ms/step - loss: 0.8816 - accuracy: 0.8676 - val_loss: 1.0567 - val_accuracy: 0.8130
Epoch 28/200
313/313 [==============================] - 65s 206ms/step - loss: 0.8380 - accuracy: 0.8771 - val_loss: 1.1649 - val_accuracy: 0.7764
Epoch 29/200
313/313 [==============================] - 65s 206ms/step - loss: 0.8156 - accuracy: 0.8837 - val_loss: 1.3089 - val_accuracy: 0.7377
Epoch 30/200
313/313 [==============================] - 65s 205ms/step - loss: 0.7889 - accuracy: 0.8903 - val_loss: 1.1975 - val_accuracy: 0.7685
Epoch 31/200
313/313 [==============================] - 65s 205ms/step - loss: 0.7671 - accuracy: 0.8957 - val_loss: 1.2099 - val_accuracy: 0.7647
Epoch 32/200
313/313 [==============================] - 65s 205ms/step - loss: 0.7418 - accuracy: 0.9043 - val_loss: 1.0759 - val_accuracy: 0.8037
Epoch 33/200
313/313 [==============================] - 65s 205ms/step - loss: 0.7097 - accuracy: 0.9162 - val_loss: 1.0745 - val_accuracy: 0.8019
Epoch 34/200
313/313 [==============================] - 65s 205ms/step - loss: 0.6934 - accuracy: 0.9185 - val_loss: 1.1344 - val_accuracy: 0.8054
Epoch 35/200
313/313 [==============================] - 65s 205ms/step - loss: 0.6692 - accuracy: 0.9255 - val_loss: 1.0895 - val_accuracy: 0.8020
Epoch 36/200
313/313 [==============================] - 65s 206ms/step - loss: 0.6396 - accuracy: 0.9361 - val_loss: 0.9024 - val_accuracy: 0.8565
Epoch 37/200
313/313 [==============================] - 65s 206ms/step - loss: 0.6162 - accuracy: 0.9429 - val_loss: 0.8929 - val_accuracy: 0.8604
Epoch 38/200
313/313 [==============================] - 65s 205ms/step - loss: 0.5810 - accuracy: 0.9519 - val_loss: 0.9448 - val_accuracy: 0.8468
Epoch 39/200
313/313 [==============================] - 65s 205ms/step - loss: 0.5509 - accuracy: 0.9571 - val_loss: 0.8832 - val_accuracy: 0.8583
Epoch 40/200
313/313 [==============================] - 65s 205ms/step - loss: 0.5220 - accuracy: 0.9645 - val_loss: 0.8742 - val_accuracy: 0.8623
Epoch 41/200
313/313 [==============================] - 65s 205ms/step - loss: 0.4913 - accuracy: 0.9697 - val_loss: 0.7709 - val_accuracy: 0.8938
Epoch 42/200
313/313 [==============================] - 65s 206ms/step - loss: 0.4599 - accuracy: 0.9763 - val_loss: 0.7776 - val_accuracy: 0.8851
Epoch 43/200
313/313 [==============================] - 65s 205ms/step - loss: 0.4277 - accuracy: 0.9819 - val_loss: 0.7599 - val_accuracy: 0.8855
Epoch 44/200
313/313 [==============================] - 65s 206ms/step - loss: 0.3961 - accuracy: 0.9873 - val_loss: 0.6869 - val_accuracy: 0.9045
Epoch 45/200
313/313 [==============================] - ETA: 0s - loss: 0.3676 - accuracy: 0.9916INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 88s 277ms/step - loss: 0.3676 - accuracy: 0.9916 - val_loss: 0.6358 - val_accuracy: 0.9121
Epoch 46/200
313/313 [==============================] - ETA: 0s - loss: 0.3473 - accuracy: 0.9941INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 85s 270ms/step - loss: 0.3473 - accuracy: 0.9941 - val_loss: 0.5995 - val_accuracy: 0.9192
Epoch 47/200
313/313 [==============================] - ETA: 0s - loss: 0.3303 - accuracy: 0.9951INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 88s 279ms/step - loss: 0.3303 - accuracy: 0.9951 - val_loss: 0.5885 - val_accuracy: 0.9243
Epoch 48/200
313/313 [==============================] - ETA: 0s - loss: 0.3171 - accuracy: 0.9967INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 86s 272ms/step - loss: 0.3171 - accuracy: 0.9967 - val_loss: 0.5638 - val_accuracy: 0.9274
Epoch 49/200
313/313 [==============================] - ETA: 0s - loss: 0.3072 - accuracy: 0.9977INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 88s 279ms/step - loss: 0.3072 - accuracy: 0.9977 - val_loss: 0.5452 - val_accuracy: 0.9301
Epoch 50/200
313/313 [==============================] - ETA: 0s - loss: 0.3015 - accuracy: 0.9979INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 87s 275ms/step - loss: 0.3015 - accuracy: 0.9979 - val_loss: 0.5400 - val_accuracy: 0.9314
Epoch 51/200
313/313 [==============================] - ETA: 0s - loss: 0.2988 - accuracy: 0.9978INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 88s 278ms/step - loss: 0.2988 - accuracy: 0.9978 - val_loss: 0.5350 - val_accuracy: 0.9326
Epoch 52/200
313/313 [==============================] - 65s 206ms/step - loss: 0.2964 - accuracy: 0.9982 - val_loss: 0.5312 - val_accuracy: 0.9321
Epoch 53/200
313/313 [==============================] - ETA: 0s - loss: 0.2941 - accuracy: 0.9986INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug/assets
313/313 [==============================] - 86s 272ms/step - loss: 0.2941 - accuracy: 0.9986 - val_loss: 0.5315 - val_accuracy: 0.9333
Epoch 54/200
313/313 [==============================] - 65s 206ms/step - loss: 0.4289 - accuracy: 0.9575 - val_loss: 21.7207 - val_accuracy: 0.1660
Epoch 55/200
313/313 [==============================] - 65s 205ms/step - loss: 0.9306 - accuracy: 0.8374 - val_loss: 2.5987 - val_accuracy: 0.5058
Epoch 56/200
313/313 [==============================] - 65s 206ms/step - loss: 0.8564 - accuracy: 0.8792 - val_loss: 1.1686 - val_accuracy: 0.7933
Epoch 57/200
313/313 [==============================] - 65s 206ms/step - loss: 0.8477 - accuracy: 0.8920 - val_loss: 1.3515 - val_accuracy: 0.7610
Epoch 58/200
313/313 [==============================] - 65s 206ms/step - loss: 0.8533 - accuracy: 0.8966 - val_loss: 1.1338 - val_accuracy: 0.8161
Epoch 59/200
313/313 [==============================] - 65s 206ms/step - loss: 0.8461 - accuracy: 0.9028 - val_loss: 1.5255 - val_accuracy: 0.7540
Epoch 60/200
313/313 [==============================] - 65s 206ms/step - loss: 0.8436 - accuracy: 0.9063 - val_loss: 1.3053 - val_accuracy: 0.7768
Halting Training
Saving best model to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_BasicDataAug
History saved to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/history.csv
display(results)
fig.show()
Epochs                                                60
Batch Size                                           128
Model Name           WideResNet_28_10_Fixed_BasicDataAug
Remarks
Model Path    /content/drive/MyDrive/Data/DELE CA1/CIFAR10/S...
Train Loss                                      0.294071
Test Loss                                       0.531471
Train Acc                                       0.998625
Test Acc                                          0.9333
[Train - Test] Acc                              0.065325
dtype: object
We can observe that the Wide ResNet trains very well, reaching 93% validation accuracy in under 60 epochs. However, the gap between train and validation accuracy shows that the model still heavily overfits, even with basic data augmentation, dropout and L2 regularization in place.
To counteract this overfitting, I employ a state-of-the-art data augmentation method known as CutMix.
CutMix cuts a rectangular patch out of one image and pastes it onto another. In addition, the label is mixed in proportion to the patch area, so that it reflects both classes present in the combined image. This helps the model learn to identify an object from a partial view of it, improving generalization.
Implementation adapted from: https://keras.io/examples/vision/cutmix/
def sample_beta_distribution(size, concentration_0=0.2, concentration_1=0.2):
    # A Beta(c1, c0) sample is the ratio of two independent Gamma samples
    gamma_1_sample = tf.random.gamma(shape=[size], alpha=concentration_1)
    gamma_2_sample = tf.random.gamma(shape=[size], alpha=concentration_0)
    return gamma_1_sample / (gamma_1_sample + gamma_2_sample)
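The helper above relies on the fact that if X ~ Gamma(a) and Y ~ Gamma(b) are independent, then X / (X + Y) ~ Beta(a, b). A minimal NumPy sketch (separate from the notebook's TensorFlow code; the variable names are illustrative) checking that the empirical mean matches the Beta mean a / (a + b):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 0.2, 0.2  # the default concentrations used above
n = 100_000

# Draw Beta(a, b) samples via the ratio of two independent Gamma samples
x = rng.gamma(shape=a, scale=1.0, size=n)
y = rng.gamma(shape=b, scale=1.0, size=n)
samples = x / (x + y)

# The mean of Beta(a, b) is a / (a + b), i.e. 0.5 here
empirical_mean = samples.mean()
```

With small, equal concentrations the distribution is U-shaped, pushing most samples towards 0 or 1 — which is why CutMix patches tend to be either very small or very large.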
@tf.function
def get_box(lambda_value):
    # Size the cut so that its area covers roughly (1 - lambda) of the image
    cut_rat = tf.math.sqrt(1.0 - lambda_value)
    image_wh = IMG_SIZE[0]
    cut_wh = image_wh * cut_rat  # rw
    cut_wh = tf.cast(cut_wh, tf.int32)
    cut_x = tf.random.uniform((1,), minval=0, maxval=image_wh, dtype=tf.int32)  # rx
    cut_y = tf.random.uniform((1,), minval=0, maxval=image_wh, dtype=tf.int32)  # ry
    boundaryx1 = tf.clip_by_value(cut_x[0] - cut_wh // 2, 0, image_wh)
    boundaryy1 = tf.clip_by_value(cut_y[0] - cut_wh // 2, 0, image_wh)
    bbx2 = tf.clip_by_value(cut_x[0] + cut_wh // 2, 0, image_wh)
    bby2 = tf.clip_by_value(cut_y[0] + cut_wh // 2, 0, image_wh)
    target_h = bby2 - boundaryy1
    if target_h == 0:
        target_h += 1
    target_w = bbx2 - boundaryx1
    if target_w == 0:
        target_w += 1
    return boundaryx1, boundaryy1, target_h, target_w
@tf.function
def cutmix(train_ds_one, train_ds_two):
    (image1, label1), (image2, label2) = train_ds_one, train_ds_two
    image_size = IMG_SIZE[0]
    alpha = [1]
    beta = [1]
    # Get a sample from the Beta distribution
    lambda_value = sample_beta_distribution(1, alpha, beta)
    # Define lambda
    lambda_value = lambda_value[0][0]
    # Get the bounding box offsets, heights and widths
    boundaryx1, boundaryy1, target_h, target_w = get_box(lambda_value)
    # Get a patch from the second image (`image2`)
    crop2 = tf.image.crop_to_bounding_box(
        image2, boundaryy1, boundaryx1, target_h, target_w
    )
    # Pad the `image2` patch (`crop2`) with the same offset
    image2 = tf.image.pad_to_bounding_box(
        crop2, boundaryy1, boundaryx1, image_size, image_size
    )
    # Get a patch from the first image (`image1`)
    crop1 = tf.image.crop_to_bounding_box(
        image1, boundaryy1, boundaryx1, target_h, target_w
    )
    # Pad the `image1` patch (`crop1`) with the same offset
    img1 = tf.image.pad_to_bounding_box(
        crop1, boundaryy1, boundaryx1, image_size, image_size
    )
    # Zero out the patch region of `image1` (before applying the `image2` patch)
    image1 = image1 - img1
    # Add the modified `image1` and `image2` together to get the CutMix image
    image = image1 + image2
    # Adjust lambda in accordance with the actual pixel ratio of the patch
    lambda_value = 1 - (target_w * target_h) / (image_size * image_size)
    lambda_value = tf.cast(lambda_value, tf.float32)
    # Combine the labels of both images, weighted by lambda
    label = lambda_value * label1 + (1 - lambda_value) * label2
    return image, label
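Because the box is clipped at the image border, the realized patch can be smaller than the one implied by the sampled lambda, which is why lambda is recomputed from the actual pixel count before mixing the labels. A small NumPy sketch of that final step, with illustrative numbers:

```python
import numpy as np

image_size = 32
target_w, target_h = 16, 16          # realized patch size after clipping
label1 = np.array([1.0, 0.0, 0.0])   # one-hot label of the base image
label2 = np.array([0.0, 1.0, 0.0])   # one-hot label of the pasted image

# Lambda is the fraction of pixels that still come from the base image
lam = 1 - (target_w * target_h) / (image_size * image_size)  # 1 - 256/1024 = 0.75

# Mix the labels in proportion to the visible pixel area
label = lam * label1 + (1 - lam) * label2
# label -> [0.75, 0.25, 0.0]
```

The mixed label trains the model against a soft target, so it is penalized for ignoring whichever class occupies the smaller region.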
def set_up_cutmix():
    train_ds_one = tf.data.Dataset.from_tensor_slices((X_train, y_train)).shuffle(2048)
    train_ds_two = tf.data.Dataset.from_tensor_slices((X_train, y_train)).shuffle(2048)
    val_ds = tf.data.Dataset.from_tensor_slices((X_val, y_val)).shuffle(2048).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
    train_ds = tf.data.Dataset.zip((train_ds_one, train_ds_two))
    train_ds_cutmix = (
        train_ds.shuffle(1024)
        .map(cutmix, num_parallel_calls=tf.data.AUTOTUNE)
        .batch(BATCH_SIZE)
        .prefetch(tf.data.AUTOTUNE)
    )
    return train_ds_cutmix, val_ds
train_cutmix_ds, val_ds = set_up_cutmix()
image_batch, label_batch = next(iter(train_cutmix_ds))
plt.figure(figsize=(10, 10))
for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    plt.title(class_labels[np.argmax(label_batch[i])])
    plt.imshow(tf.squeeze(image_batch[i]))
    plt.axis("off")
lr_scheduler = tf.keras.optimizers.schedules.CosineDecayRestarts(
    LR,
    steps_per_epoch * 10  # found to work well for CIFAR10 with shorter training time
)
optimizer = SGD(
    learning_rate=lr_scheduler,
    momentum=momentum
)
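CosineDecayRestarts anneals the learning rate along a cosine curve from the initial value towards zero over the first decay period (here `steps_per_epoch * 10`), then warm-restarts with a period that grows by `t_mul` each cycle. A pure-Python sketch of that shape under the Keras defaults (`t_mul=2.0`, `m_mul=1.0`, `alpha=0.0`); `lr0` and `first_decay_steps` below are illustrative stand-ins for `LR` and `steps_per_epoch * 10`:

```python
import math

def cosine_decay_restarts(step, lr0, first_decay_steps,
                          t_mul=2.0, m_mul=1.0, alpha=0.0):
    # Find which restart cycle the step falls into, and the progress within it
    period = float(first_decay_steps)
    start = 0.0
    m = 1.0  # per-cycle amplitude multiplier (m_mul applied at each restart)
    while step >= start + period:
        start += period
        period *= t_mul
        m *= m_mul
    frac = (step - start) / period
    # Cosine anneal from lr0 * m down to alpha * lr0 within the cycle
    cosine = 0.5 * (1 + math.cos(math.pi * frac))
    return lr0 * m * ((1 - alpha) * cosine + alpha)

lr0, first = 0.1, 1000
start_lr = cosine_decay_restarts(0, lr0, first)       # full lr at step 0
pre_restart = cosine_decay_restarts(999, lr0, first)  # near zero before restart
restart_lr = cosine_decay_restarts(1000, lr0, first)  # warm restart back to lr0
```

The periodic restarts are what produce the sharp accuracy dips visible in the training log above (e.g. around epochs 22 and 54): each restart kicks the learning rate back up, briefly destabilizing training before it re-converges, often to a better minimum.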
train_cutmix_ds, val_ds = set_up_cutmix()
model = build_wideresnet(optimizer, name="WideResNet_28_10_Fixed_CutMix")
results, fig = evaluator.evaluate_model(model, training_data=train_cutmix_ds, validation_data=val_ds, callbacks=callbacks)
evaluator.save_history()
Model: "WideResNet_28_10_Fixed_CutMix"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(None, 32, 32, 3)] 0
normalization (Normalizatio (None, 32, 32, 3) 7
n)
conv2d_31 (Conv2D) (None, 32, 32, 16) 432
wide_resnet_block_3 (WideRe (None, 32, 32, 160) 1719744
snetBlock)
wide_resnet_block_4 (WideRe (None, 16, 16, 320) 6972800
snetBlock)
wide_resnet_block_5 (WideRe (None, 8, 8, 640) 27872000
snetBlock)
batch_normalization_49 (Bat (None, 8, 8, 640) 2560
chNormalization)
re_lu_1 (ReLU) (None, 8, 8, 640) 0
global_average_pooling2d_1 (None, 640) 0
(GlobalAveragePooling2D)
dense_1 (Dense) (None, 10) 6410
=================================================================
Total params: 36,573,953
Trainable params: 36,555,994
Non-trainable params: 17,959
_________________________________________________________________
None
Training WideResNet_28_10_Fixed_CutMix
Epoch 1/200
313/313 [==============================] - ETA: 0s - loss: 11.4090 - accuracy: 0.1981INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 91s 278ms/step - loss: 11.4090 - accuracy: 0.1981 - val_loss: 9.9617 - val_accuracy: 0.2438
Epoch 2/200
313/313 [==============================] - 65s 207ms/step - loss: 8.8816 - accuracy: 0.2587 - val_loss: 8.3595 - val_accuracy: 0.1412
Epoch 3/200
313/313 [==============================] - ETA: 0s - loss: 7.0215 - accuracy: 0.3122INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 85s 273ms/step - loss: 7.0215 - accuracy: 0.3122 - val_loss: 6.2497 - val_accuracy: 0.3059
Epoch 4/200
313/313 [==============================] - ETA: 0s - loss: 5.6004 - accuracy: 0.3907INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 88s 281ms/step - loss: 5.6004 - accuracy: 0.3907 - val_loss: 5.0512 - val_accuracy: 0.3565
Epoch 5/200
313/313 [==============================] - ETA: 0s - loss: 4.5187 - accuracy: 0.4748INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 86s 275ms/step - loss: 4.5187 - accuracy: 0.4748 - val_loss: 3.9538 - val_accuracy: 0.4525
Epoch 6/200
313/313 [==============================] - 65s 206ms/step - loss: 3.7367 - accuracy: 0.5263 - val_loss: 4.2787 - val_accuracy: 0.2611
Epoch 7/200
313/313 [==============================] - ETA: 0s - loss: 3.1777 - accuracy: 0.5531INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 89s 285ms/step - loss: 3.1777 - accuracy: 0.5531 - val_loss: 2.6988 - val_accuracy: 0.5857
Epoch 8/200
313/313 [==============================] - ETA: 0s - loss: 2.7518 - accuracy: 0.5839INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 87s 278ms/step - loss: 2.7518 - accuracy: 0.5839 - val_loss: 2.1109 - val_accuracy: 0.6787
Epoch 9/200
313/313 [==============================] - ETA: 0s - loss: 2.4368 - accuracy: 0.6131INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 89s 286ms/step - loss: 2.4368 - accuracy: 0.6131 - val_loss: 1.7032 - val_accuracy: 0.7403
Epoch 10/200
313/313 [==============================] - 65s 208ms/step - loss: 2.1900 - accuracy: 0.6403 - val_loss: 1.6448 - val_accuracy: 0.7005
Epoch 11/200
313/313 [==============================] - 65s 207ms/step - loss: 2.0130 - accuracy: 0.6569 - val_loss: 1.4603 - val_accuracy: 0.7310
Epoch 12/200
313/313 [==============================] - ETA: 0s - loss: 1.8738 - accuracy: 0.6744INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 86s 275ms/step - loss: 1.8738 - accuracy: 0.6744 - val_loss: 1.2980 - val_accuracy: 0.7617
Epoch 13/200
313/313 [==============================] - 65s 207ms/step - loss: 1.7670 - accuracy: 0.6865 - val_loss: 1.2762 - val_accuracy: 0.7415
Epoch 14/200
313/313 [==============================] - ETA: 0s - loss: 1.6910 - accuracy: 0.6937INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 86s 276ms/step - loss: 1.6910 - accuracy: 0.6937 - val_loss: 1.1348 - val_accuracy: 0.7794
Epoch 15/200
313/313 [==============================] - ETA: 0s - loss: 1.6251 - accuracy: 0.7049INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 88s 280ms/step - loss: 1.6251 - accuracy: 0.7049 - val_loss: 1.0962 - val_accuracy: 0.7837
Epoch 16/200
313/313 [==============================] - 65s 207ms/step - loss: 1.5724 - accuracy: 0.7125 - val_loss: 1.1180 - val_accuracy: 0.7755
Epoch 17/200
313/313 [==============================] - ETA: 0s - loss: 1.5347 - accuracy: 0.7218INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 88s 280ms/step - loss: 1.5347 - accuracy: 0.7218 - val_loss: 0.9991 - val_accuracy: 0.7946
Epoch 18/200
313/313 [==============================] - ETA: 0s - loss: 1.4974 - accuracy: 0.7301INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 88s 281ms/step - loss: 1.4974 - accuracy: 0.7301 - val_loss: 0.9637 - val_accuracy: 0.8020
Epoch 19/200
313/313 [==============================] - 65s 207ms/step - loss: 1.4666 - accuracy: 0.7401 - val_loss: 0.9841 - val_accuracy: 0.7909
Epoch 20/200
313/313 [==============================] - ETA: 0s - loss: 1.4372 - accuracy: 0.7454INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 86s 276ms/step - loss: 1.4372 - accuracy: 0.7454 - val_loss: 0.8917 - val_accuracy: 0.8236
Epoch 21/200
313/313 [==============================] - ETA: 0s - loss: 1.4197 - accuracy: 0.7526INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 88s 280ms/step - loss: 1.4197 - accuracy: 0.7526 - val_loss: 0.8855 - val_accuracy: 0.8273
Epoch 22/200
313/313 [==============================] - ETA: 0s - loss: 1.3947 - accuracy: 0.7657INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 87s 277ms/step - loss: 1.3947 - accuracy: 0.7657 - val_loss: 0.8258 - val_accuracy: 0.8463
Epoch 23/200
313/313 [==============================] - 65s 208ms/step - loss: 1.3777 - accuracy: 0.7691 - val_loss: 0.9028 - val_accuracy: 0.8243
Epoch 24/200
313/313 [==============================] - 65s 207ms/step - loss: 1.3632 - accuracy: 0.7732 - val_loss: 0.8866 - val_accuracy: 0.8232
Epoch 25/200
313/313 [==============================] - 65s 207ms/step - loss: 1.3454 - accuracy: 0.7835 - val_loss: 0.9740 - val_accuracy: 0.7976
Epoch 26/200
313/313 [==============================] - ETA: 0s - loss: 1.3275 - accuracy: 0.7898INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 88s 282ms/step - loss: 1.3275 - accuracy: 0.7898 - val_loss: 0.7111 - val_accuracy: 0.8842
Epoch 27/200
313/313 [==============================] - 65s 206ms/step - loss: 1.3099 - accuracy: 0.7959 - val_loss: 0.7448 - val_accuracy: 0.8707
Epoch 28/200
313/313 [==============================] - 65s 207ms/step - loss: 1.2986 - accuracy: 0.8001 - val_loss: 0.7357 - val_accuracy: 0.8811
Epoch 29/200
313/313 [==============================] - 65s 207ms/step - loss: 1.2824 - accuracy: 0.8059 - val_loss: 0.8312 - val_accuracy: 0.8515
Epoch 30/200
313/313 [==============================] - ETA: 0s - loss: 1.2636 - accuracy: 0.8141INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 87s 279ms/step - loss: 1.2636 - accuracy: 0.8141 - val_loss: 0.7098 - val_accuracy: 0.8894
Epoch 31/200
313/313 [==============================] - ETA: 0s - loss: 1.2503 - accuracy: 0.8198INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 86s 276ms/step - loss: 1.2503 - accuracy: 0.8198 - val_loss: 0.7088 - val_accuracy: 0.8935
Epoch 32/200
313/313 [==============================] - 65s 207ms/step - loss: 1.2271 - accuracy: 0.8254 - val_loss: 0.7023 - val_accuracy: 0.8883
Epoch 33/200
313/313 [==============================] - ETA: 0s - loss: 1.2114 - accuracy: 0.8327INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 87s 277ms/step - loss: 1.2114 - accuracy: 0.8327 - val_loss: 0.6927 - val_accuracy: 0.8943
Epoch 34/200
313/313 [==============================] - ETA: 0s - loss: 1.1993 - accuracy: 0.8353INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 87s 277ms/step - loss: 1.1993 - accuracy: 0.8353 - val_loss: 0.6534 - val_accuracy: 0.9057
Epoch 35/200
313/313 [==============================] - 65s 207ms/step - loss: 1.1795 - accuracy: 0.8441 - val_loss: 0.6949 - val_accuracy: 0.8970
Epoch 36/200
313/313 [==============================] - ETA: 0s - loss: 1.1626 - accuracy: 0.8493INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 88s 281ms/step - loss: 1.1626 - accuracy: 0.8493 - val_loss: 0.6580 - val_accuracy: 0.9079
Epoch 37/200
313/313 [==============================] - ETA: 0s - loss: 1.1445 - accuracy: 0.8524INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 85s 273ms/step - loss: 1.1445 - accuracy: 0.8524 - val_loss: 0.6374 - val_accuracy: 0.9089
Epoch 38/200
313/313 [==============================] - ETA: 0s - loss: 1.1260 - accuracy: 0.8585INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 87s 278ms/step - loss: 1.1260 - accuracy: 0.8585 - val_loss: 0.6252 - val_accuracy: 0.9144
Epoch 39/200
313/313 [==============================] - ETA: 0s - loss: 1.1047 - accuracy: 0.8647INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 87s 277ms/step - loss: 1.1047 - accuracy: 0.8647 - val_loss: 0.6166 - val_accuracy: 0.9172
Epoch 40/200
313/313 [==============================] - ETA: 0s - loss: 1.0845 - accuracy: 0.8722INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 86s 276ms/step - loss: 1.0845 - accuracy: 0.8722 - val_loss: 0.6015 - val_accuracy: 0.9178
Epoch 41/200
313/313 [==============================] - ETA: 0s - loss: 1.0610 - accuracy: 0.8755INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 86s 276ms/step - loss: 1.0610 - accuracy: 0.8755 - val_loss: 0.5895 - val_accuracy: 0.9190
Epoch 42/200
313/313 [==============================] - 65s 208ms/step - loss: 1.0460 - accuracy: 0.8810 - val_loss: 0.5838 - val_accuracy: 0.9179
Epoch 43/200
313/313 [==============================] - ETA: 0s - loss: 1.0293 - accuracy: 0.8862INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 86s 276ms/step - loss: 1.0293 - accuracy: 0.8862 - val_loss: 0.5701 - val_accuracy: 0.9270
Epoch 44/200
313/313 [==============================] - 65s 208ms/step - loss: 1.0109 - accuracy: 0.8906 - val_loss: 0.5675 - val_accuracy: 0.9250
Epoch 45/200
313/313 [==============================] - ETA: 0s - loss: 0.9948 - accuracy: 0.8952INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 87s 280ms/step - loss: 0.9948 - accuracy: 0.8952 - val_loss: 0.5494 - val_accuracy: 0.9318
Epoch 46/200
313/313 [==============================] - ETA: 0s - loss: 0.9831 - accuracy: 0.8982INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 87s 277ms/step - loss: 0.9831 - accuracy: 0.8982 - val_loss: 0.5324 - val_accuracy: 0.9325
Epoch 47/200
313/313 [==============================] - ETA: 0s - loss: 0.9650 - accuracy: 0.9013INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 88s 283ms/step - loss: 0.9650 - accuracy: 0.9013 - val_loss: 0.5353 - val_accuracy: 0.9331
Epoch 48/200
313/313 [==============================] - ETA: 0s - loss: 0.9535 - accuracy: 0.9058INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 87s 278ms/step - loss: 0.9535 - accuracy: 0.9058 - val_loss: 0.5180 - val_accuracy: 0.9355
Epoch 49/200
313/313 [==============================] - 65s 209ms/step - loss: 0.9414 - accuracy: 0.9091 - val_loss: 0.5171 - val_accuracy: 0.9339
Epoch 50/200
313/313 [==============================] - ETA: 0s - loss: 0.9368 - accuracy: 0.9097INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 87s 277ms/step - loss: 0.9368 - accuracy: 0.9097 - val_loss: 0.5104 - val_accuracy: 0.9366
Epoch 51/200
313/313 [==============================] - ETA: 0s - loss: 0.9259 - accuracy: 0.9112INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 86s 276ms/step - loss: 0.9259 - accuracy: 0.9112 - val_loss: 0.5062 - val_accuracy: 0.9375
Epoch 52/200
313/313 [==============================] - ETA: 0s - loss: 0.9182 - accuracy: 0.9143INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 88s 282ms/step - loss: 0.9182 - accuracy: 0.9143 - val_loss: 0.5043 - val_accuracy: 0.9376
Epoch 53/200
313/313 [==============================] - ETA: 0s - loss: 0.9144 - accuracy: 0.9160INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix/assets
313/313 [==============================] - 86s 276ms/step - loss: 0.9144 - accuracy: 0.9160 - val_loss: 0.5017 - val_accuracy: 0.9391
Epoch 54/200
313/313 [==============================] - 65s 209ms/step - loss: 0.9124 - accuracy: 0.9156 - val_loss: 0.5005 - val_accuracy: 0.9379
Epoch 55/200
313/313 [==============================] - 65s 208ms/step - loss: 0.9098 - accuracy: 0.9182 - val_loss: 0.4987 - val_accuracy: 0.9383
Epoch 56/200
313/313 [==============================] - 65s 208ms/step - loss: 0.9080 - accuracy: 0.9155 - val_loss: 0.4991 - val_accuracy: 0.9385
Epoch 57/200
313/313 [==============================] - 65s 208ms/step - loss: 1.1532 - accuracy: 0.8317 - val_loss: 1.9981 - val_accuracy: 0.4750
Epoch 58/200
313/313 [==============================] - 65s 209ms/step - loss: 1.5826 - accuracy: 0.7160 - val_loss: 1.0300 - val_accuracy: 0.8086
Epoch 59/200
313/313 [==============================] - 65s 209ms/step - loss: 1.5466 - accuracy: 0.7403 - val_loss: 1.0454 - val_accuracy: 0.8141
Epoch 60/200
313/313 [==============================] - 65s 208ms/step - loss: 1.5303 - accuracy: 0.7573 - val_loss: 1.0579 - val_accuracy: 0.8044
Epoch 61/200
313/313 [==============================] - 65s 209ms/step - loss: 1.5235 - accuracy: 0.7561 - val_loss: 1.0476 - val_accuracy: 0.8070
Epoch 62/200
313/313 [==============================] - 65s 208ms/step - loss: 1.5179 - accuracy: 0.7614 - val_loss: 0.9848 - val_accuracy: 0.8372
Epoch 63/200
313/313 [==============================] - 65s 208ms/step - loss: 1.5117 - accuracy: 0.7684 - val_loss: 0.9821 - val_accuracy: 0.8370
Epoch 64/200
313/313 [==============================] - 65s 208ms/step - loss: 1.5101 - accuracy: 0.7678 - val_loss: 0.9856 - val_accuracy: 0.8440
Epoch 65/200
313/313 [==============================] - 65s 208ms/step - loss: 1.5041 - accuracy: 0.7738 - val_loss: 0.9883 - val_accuracy: 0.8459
Epoch 66/200
313/313 [==============================] - 65s 208ms/step - loss: 1.5055 - accuracy: 0.7730 - val_loss: 1.0632 - val_accuracy: 0.8174
Epoch 67/200
313/313 [==============================] - 65s 208ms/step - loss: 1.5058 - accuracy: 0.7734 - val_loss: 1.0691 - val_accuracy: 0.8125
Epoch 68/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4989 - accuracy: 0.7749 - val_loss: 0.9990 - val_accuracy: 0.8376
Epoch 69/200
313/313 [==============================] - 65s 209ms/step - loss: 1.5121 - accuracy: 0.7723 - val_loss: 0.9964 - val_accuracy: 0.8404
Epoch 70/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4984 - accuracy: 0.7802 - val_loss: 1.0156 - val_accuracy: 0.8383
Epoch 71/200
313/313 [==============================] - 65s 208ms/step - loss: 1.5016 - accuracy: 0.7822 - val_loss: 1.0748 - val_accuracy: 0.8220
Epoch 72/200
313/313 [==============================] - 65s 208ms/step - loss: 1.5043 - accuracy: 0.7804 - val_loss: 1.1419 - val_accuracy: 0.7885
Epoch 73/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4970 - accuracy: 0.7866 - val_loss: 0.9901 - val_accuracy: 0.8496
Epoch 74/200
313/313 [==============================] - 65s 208ms/step - loss: 1.5011 - accuracy: 0.7831 - val_loss: 0.9137 - val_accuracy: 0.8745
Epoch 75/200
313/313 [==============================] - 65s 208ms/step - loss: 1.5030 - accuracy: 0.7827 - val_loss: 1.0809 - val_accuracy: 0.8223
Epoch 76/200
313/313 [==============================] - 65s 209ms/step - loss: 1.5056 - accuracy: 0.7832 - val_loss: 1.0134 - val_accuracy: 0.8427
Epoch 77/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4977 - accuracy: 0.7874 - val_loss: 1.0314 - val_accuracy: 0.8412
Epoch 78/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4935 - accuracy: 0.7872 - val_loss: 1.0269 - val_accuracy: 0.8425
Epoch 79/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4985 - accuracy: 0.7887 - val_loss: 0.9391 - val_accuracy: 0.8682
Epoch 80/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4992 - accuracy: 0.7916 - val_loss: 1.0200 - val_accuracy: 0.8487
Epoch 81/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4956 - accuracy: 0.7929 - val_loss: 0.9740 - val_accuracy: 0.8592
Epoch 82/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4884 - accuracy: 0.7926 - val_loss: 0.9559 - val_accuracy: 0.8691
Epoch 83/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4982 - accuracy: 0.7948 - val_loss: 0.9754 - val_accuracy: 0.8644
Epoch 84/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4857 - accuracy: 0.7947 - val_loss: 1.2134 - val_accuracy: 0.7952
Epoch 85/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4866 - accuracy: 0.7965 - val_loss: 1.0099 - val_accuracy: 0.8487
Epoch 86/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4846 - accuracy: 0.7992 - val_loss: 1.0843 - val_accuracy: 0.8303
Epoch 87/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4826 - accuracy: 0.7985 - val_loss: 1.0459 - val_accuracy: 0.8388
Epoch 88/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4752 - accuracy: 0.8033 - val_loss: 1.1337 - val_accuracy: 0.8109
Epoch 89/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4826 - accuracy: 0.7965 - val_loss: 0.9135 - val_accuracy: 0.8857
Epoch 90/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4686 - accuracy: 0.8020 - val_loss: 1.0800 - val_accuracy: 0.8314
Epoch 91/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4699 - accuracy: 0.8039 - val_loss: 1.0327 - val_accuracy: 0.8462
Epoch 92/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4665 - accuracy: 0.8050 - val_loss: 0.9147 - val_accuracy: 0.8852
Epoch 93/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4733 - accuracy: 0.8037 - val_loss: 0.9294 - val_accuracy: 0.8763
Epoch 94/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4638 - accuracy: 0.8069 - val_loss: 0.8888 - val_accuracy: 0.8905
Epoch 95/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4556 - accuracy: 0.8112 - val_loss: 0.9623 - val_accuracy: 0.8743
Epoch 96/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4613 - accuracy: 0.8066 - val_loss: 0.9035 - val_accuracy: 0.8840
Epoch 97/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4533 - accuracy: 0.8114 - val_loss: 0.9178 - val_accuracy: 0.8861
Epoch 98/200
313/313 [==============================] - 65s 208ms/step - loss: 1.4569 - accuracy: 0.8103 - val_loss: 0.9189 - val_accuracy: 0.8804
Epoch 99/200
313/313 [==============================] - 65s 209ms/step - loss: 1.4538 - accuracy: 0.8115 - val_loss: 0.9360 - val_accuracy: 0.8764
Epoch 100/200
307/313 [============================>.] - ETA: 1s - loss: 1.4461 - accuracy: 0.8164
Halting Training
Saving best model to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/WideResNet_28_10_Fixed_CutMix
History saved to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/history.csv
We stop training at 100 epochs, since validation accuracy has plateaued and the model no longer appears to be improving.
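A plateau like this can also be checked programmatically. A minimal sketch of a patience-style check (the function name and the accuracy values are illustrative, not taken from the log above):

```python
def epochs_since_best(val_acc, min_delta=1e-3):
    """Number of epochs since validation accuracy last improved by min_delta."""
    best, best_epoch = float("-inf"), 0
    for epoch, acc in enumerate(val_acc):
        if acc > best + min_delta:
            best, best_epoch = acc, epoch
    return len(val_acc) - 1 - best_epoch

# Illustrative values: accuracy climbs, then oscillates without a new best.
val_acc = [0.80, 0.88, 0.93, 0.935, 0.931, 0.934, 0.933]
print(epochs_since_best(val_acc))  # 3 -> no new best for 3 epochs
```

Once this count exceeds a chosen patience, training can be halted and the best checkpoint restored.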
evaluator.return_history()
| | Model Name | Train Acc | Test Acc | [Train - Test] Acc | Remarks |
|---|---|---|---|---|---|
| 0 | Baseline_MLP | 0.631750 | 0.5247 | 0.107050 | High Variance |
| 1 | Baseline_CNN_1 | 0.999825 | 0.8665 | 0.133325 | Low Avoidable Bias but Overfits |
| 2 | Baseline_CNN_1_DataAug | 0.996625 | 0.8826 | 0.114025 | Data Aug lowers variance, but still overfits |
| 3 | efficientnetv2-s | 0.320450 | 0.3495 | -0.029050 | |
| 4 | WideResNet_28_10_BasicDataAug | 0.982225 | 0.9072 | 0.075025 | |
| 5 | ImprovedWideResNet_28_10 | 0.743250 | 0.7258 | 0.017450 | |
| 6 | ImprovedWideResNet_28_10_No_Stochastic_Depth | 0.879075 | 0.7844 | 0.094675 | |
| 7 | ImprovedWideResNet_28_10_No_Stochastic_Depth_B... | 0.997800 | 0.8739 | 0.123900 | |
| 8 | ImprovedWideResNet_28_10_No_Stochastic_Depth_B... | 0.999425 | 0.9398 | 0.059625 | |
| 9 | ImprovedWideResNet_28_10_ProperDropout_No_Stoc... | 0.998900 | 0.9456 | 0.053300 | Best model thus far |
| 10 | ImprovedWideResNet_28_10_Dropout_No_Stochastic... | 0.999850 | 0.9170 | 0.082850 | |
| 11 | WideResNet_28_10_Fixed_BasicDataAug | 0.998625 | 0.9333 | 0.065325 | |
| 12 | WideResNet_28_10_Fixed_CutMix | 0.916025 | 0.9391 | -0.023075 | |
Impressively, CutMix reaches roughly the same test accuracy while greatly reducing overfitting: the train accuracy now sits below the test accuracy, giving a negative train-test gap.
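For reference, the core of CutMix can be sketched in a few lines of NumPy. This is a minimal illustration of the box-cutting and label-mixing logic, not the `set_up_cutmix` pipeline used above; `cutmix_pair` and its parameters are hypothetical names, with the box size drawn from a Beta(α, α) distribution as in the paper:

```python
import numpy as np

def cutmix_pair(img1, lab1, img2, lab2, alpha=1.0, rng=np.random.default_rng(0)):
    """Paste a random box from img2 into img1; mix labels by uncovered area."""
    h, w = img1.shape[:2]
    lam = rng.beta(alpha, alpha)                  # fraction of img1 to keep
    cut_h, cut_w = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = rng.integers(h), rng.integers(w)     # box centre
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    mixed = img1.copy()
    mixed[y1:y2, x1:x2] = img2[y1:y2, x1:x2]
    # Recompute lambda from the clipped box so the label weights match the pixels.
    lam_adj = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    return mixed, lam_adj * lab1 + (1 - lam_adj) * lab2
```

Because the label weights equal the pixel proportions, the mixed one-hot labels always sum to 1, which is why training loss under CutMix is higher than plain cross-entropy even when the model is doing well.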
class SEBlock(Model):
    """
    Implementation of a squeeze-and-excite block based on the original paper.
    `l2` and `WEIGHT_DECAY` are imported/defined earlier in the notebook.
    """
    def __init__(self, channels, reduction_ratio=16, activation='relu'):
        super(SEBlock, self).__init__()
        self.squeeze = GlobalAveragePooling2D()
        self.excite_1 = Dense(channels // reduction_ratio, activation=activation,
                              kernel_initializer='he_normal',
                              kernel_regularizer=l2(WEIGHT_DECAY), use_bias=False)
        self.excite_2 = Dense(channels, activation='sigmoid',
                              kernel_initializer='he_normal',
                              kernel_regularizer=l2(WEIGHT_DECAY), use_bias=False)

    def call(self, X):
        Y = self.squeeze(X)        # (N, H, W, C) -> (N, C)
        Y = self.excite_1(Y)       # bottleneck: C -> C // reduction_ratio
        Y = self.excite_2(Y)       # back to C gates in (0, 1)
        return Multiply()([X, Y])  # per-channel rescaling of the feature maps
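The final `Multiply` deserves a note: the excite output is one gate per channel, which Keras broadcasts over the spatial dimensions. The same arithmetic in NumPy, with an explicit reshape and the two Dense layers replaced by a bare sigmoid just to show the gating (a sketch, not the layer above):

```python
import numpy as np

# Toy feature map: batch of 2, 4x4 spatial, 8 channels.
X = np.random.default_rng(0).standard_normal((2, 4, 4, 8))

s = X.mean(axis=(1, 2))              # squeeze: GlobalAveragePooling2D -> (2, 8)
gates = 1.0 / (1.0 + np.exp(-s))     # stand-in for excite_1/excite_2 + sigmoid
Y = X * gates[:, None, None, :]      # excite: per-channel scaling, broadcast over H, W

assert Y.shape == X.shape
# Every channel of every sample is scaled by a single scalar in (0, 1).
assert np.allclose(Y[0, :, :, 3], X[0, :, :, 3] * gates[0, 3])
```

So the block never mixes spatial positions; it only reweights whole channels based on their global statistics.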
# Adapted for Keras from https://github.com/szagoruyko/wide-residual-networks/blob/master/models/wide-resnet.lua
class SEWideResNetLayer(Model):
    """
    B(3,3) block, which was found to perform best in the WRN paper,
    with a squeeze-and-excite block after the second convolution.
    Dropout is added between the convolutions, after the activation.
    Order of BN, Conv and ReLU changed to pre-activation BN-ReLU-Conv.
    """
    def __init__(self, num_channels, k, use_1x1conv=False, strides=1, activation='relu', dropout=0.3):
        super(SEWideResNetLayer, self).__init__()
        self.activation = Activation(activation)
        self.bn_1 = BatchNormalization(epsilon=1e-5, gamma_initializer='uniform')
        self.conv_1 = Conv2D(num_channels * k, 3, strides=strides, padding="same",
                             kernel_regularizer=l2(WEIGHT_DECAY), use_bias=False,
                             kernel_initializer='he_normal')
        self.dropout = Dropout(dropout)  # 0.3 was found to be optimal for CIFAR-10
        self.bn_2 = BatchNormalization(epsilon=1e-5, gamma_initializer='uniform')
        self.conv_2 = Conv2D(num_channels * k, 3, strides=1, padding="same",
                             kernel_regularizer=l2(WEIGHT_DECAY), use_bias=False,
                             kernel_initializer='he_normal')
        if use_1x1conv:
            # 1x1 convolution to match shapes on the shortcut path
            self.conv3 = Conv2D(num_channels * k, 1, strides=strides, padding="same",
                                kernel_regularizer=l2(WEIGHT_DECAY), use_bias=False,
                                kernel_initializer='he_normal')
        else:
            self.conv3 = None
        self.se = SEBlock(num_channels * k, activation=activation)

    def call(self, X):
        Y = self.bn_1(X)
        Y = self.activation(Y)
        Y = self.conv_1(Y)
        Y = self.bn_2(Y)
        Y = self.activation(Y)
        Y = self.dropout(Y)
        Y = self.conv_2(Y)
        Y = self.se(Y)
        if self.conv3 is not None:
            X = self.conv3(X)  # shortcut connection
        Y += X
        return Y
# LR, steps_per_epoch and momentum are defined earlier in the notebook.
lr_scheduler = tf.keras.optimizers.schedules.CosineDecayRestarts(
    LR,
    steps_per_epoch * 10  # found to work well for CIFAR-10 with a shorter training time
)
optimizer = SGD(
    learning_rate=lr_scheduler,
    momentum=momentum
)
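The warm restarts are what cause the periodic loss spikes visible in the training logs: each time the schedule resets, the learning rate jumps back to its initial value. A plain-Python sketch of the restart logic (illustrative only, not the exact Keras implementation; Keras's default `t_mul=2.0` makes each cycle twice as long as the previous one):

```python
import math

def cosine_restarts(step, first_decay_steps, t_mul=2.0, lr0=0.1, alpha=0.0):
    """Cosine decay that restarts, each cycle t_mul times longer than the last."""
    cycle_len, start = first_decay_steps, 0
    while step >= start + cycle_len:   # find which cycle this step falls in
        start += cycle_len
        cycle_len *= t_mul
    frac = (step - start) / cycle_len  # position within the cycle, in [0, 1)
    cosine = 0.5 * (1 + math.cos(math.pi * frac))
    return lr0 * ((1 - alpha) * cosine + alpha)

# LR decays within a cycle, then jumps back to lr0 at each restart.
print(cosine_restarts(0, 100))    # 0.1 (start of first cycle)
print(cosine_restarts(99, 100))   # near 0 (end of first cycle)
print(cosine_restarts(100, 100))  # 0.1 again (restart)
```

With `first_decay_steps = steps_per_epoch * 10`, restarts land near epochs 10, 30, 70, and so on, which lines up with the sudden loss jumps in the logs above.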
train_cutmix_ds, val_ds = set_up_cutmix()
model = build_wideresnet(optimizer, name="SEWRN_28_10_Fixed_Cutmix", layer=SEWideResNetLayer)
results, fig = evaluator.evaluate_model(model, training_data=train_cutmix_ds, validation_data=val_ds, callbacks=callbacks)
evaluator.save_history()
Model: "SEWRN_28_10_Fixed_Cutmix"
_________________________________________________________________
 Layer (type)                                 Output Shape        Param #
=================================================================
 input_2 (InputLayer)                         [(None, 32, 32, 3)] 0
 normalization (Normalization)                (None, 32, 32, 3)   7
 conv2d_13 (Conv2D)                           (None, 32, 32, 16)  432
 wide_resnet_block_1 (WideResnetBlock)        (None, 32, 32, 160) 1732544
 wide_resnet_block_2 (WideResnetBlock)        (None, 16, 16, 320) 7024000
 wide_resnet_block_3 (WideResnetBlock)        (None, 8, 8, 640)   28076800
 batch_normalization_32 (BatchNormalization)  (None, 8, 8, 640)   2560
 activation_15 (Activation)                   (None, 8, 8, 640)   0
 global_average_pooling2d_16 (GlobalAveragePooling2D)  (None, 640) 0
 dense_32 (Dense)                             (None, 10)          6410
=================================================================
Total params: 36,842,753
Trainable params: 36,824,794
Non-trainable params: 17,959
_________________________________________________________________
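The channel widths in the summary can be sanity-checked against the WRN-28-10 recipe: a depth-28 network has (28 − 4) / 6 = 4 two-convolution residual blocks per group, and the base widths 16/32/64 are multiplied by the widening factor k = 10:

```python
depth, k = 28, 10
blocks_per_group = (depth - 4) // 6  # 3 groups of 2-conv blocks + 4 fixed layers
widths = [16 * k, 32 * k, 64 * k]    # per-group output channels

print(blocks_per_group, widths)      # 4 [160, 320, 640]
```

These widths match the 160/320/640 channels shown for the three `wide_resnet_block` groups above.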
Training SEWRN_28_10_Fixed_Cutmix
Epoch 1/200
313/313 [==============================] - ETA: 0s - loss: 15.4834 - accuracy: 0.1976INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 92s 216ms/step - loss: 15.4834 - accuracy: 0.1976 - val_loss: 13.3981 - val_accuracy: 0.2332
Epoch 2/200
313/313 [==============================] - ETA: 0s - loss: 11.9443 - accuracy: 0.2418INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 211ms/step - loss: 11.9443 - accuracy: 0.2418 - val_loss: 10.4764 - val_accuracy: 0.2801
Epoch 3/200
313/313 [==============================] - ETA: 0s - loss: 9.5324 - accuracy: 0.2790INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 211ms/step - loss: 9.5324 - accuracy: 0.2790 - val_loss: 8.5421 - val_accuracy: 0.2862
Epoch 4/200
313/313 [==============================] - 39s 126ms/step - loss: 7.9712 - accuracy: 0.3288 - val_loss: 7.3952 - val_accuracy: 0.2762
Epoch 5/200
313/313 [==============================] - ETA: 0s - loss: 6.9642 - accuracy: 0.4014INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 212ms/step - loss: 6.9642 - accuracy: 0.4014 - val_loss: 6.5314 - val_accuracy: 0.3776
Epoch 6/200
313/313 [==============================] - ETA: 0s - loss: 6.3558 - accuracy: 0.4712INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 211ms/step - loss: 6.3558 - accuracy: 0.4712 - val_loss: 5.9384 - val_accuracy: 0.4844
Epoch 7/200
313/313 [==============================] - ETA: 0s - loss: 6.0363 - accuracy: 0.5212INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 211ms/step - loss: 6.0363 - accuracy: 0.5212 - val_loss: 5.6278 - val_accuracy: 0.5601
Epoch 8/200
313/313 [==============================] - 39s 125ms/step - loss: 5.9251 - accuracy: 0.5463 - val_loss: 6.3929 - val_accuracy: 0.4195
Epoch 9/200
313/313 [==============================] - 39s 125ms/step - loss: 5.5021 - accuracy: 0.4588 - val_loss: 4.6048 - val_accuracy: 0.5072
Epoch 10/200
313/313 [==============================] - ETA: 0s - loss: 4.3930 - accuracy: 0.5208INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 210ms/step - loss: 4.3930 - accuracy: 0.5208 - val_loss: 3.5102 - val_accuracy: 0.6239
Epoch 11/200
313/313 [==============================] - 39s 125ms/step - loss: 3.6175 - accuracy: 0.5613 - val_loss: 3.3394 - val_accuracy: 0.5311
Epoch 12/200
313/313 [==============================] - ETA: 0s - loss: 3.0785 - accuracy: 0.5889INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 210ms/step - loss: 3.0785 - accuracy: 0.5889 - val_loss: 2.3189 - val_accuracy: 0.6975
Epoch 13/200
313/313 [==============================] - ETA: 0s - loss: 2.6677 - accuracy: 0.6218INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 65s 209ms/step - loss: 2.6677 - accuracy: 0.6218 - val_loss: 2.0322 - val_accuracy: 0.7017
Epoch 14/200
313/313 [==============================] - ETA: 0s - loss: 2.3811 - accuracy: 0.6436INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 65s 209ms/step - loss: 2.3811 - accuracy: 0.6436 - val_loss: 1.7889 - val_accuracy: 0.7234
Epoch 15/200
313/313 [==============================] - ETA: 0s - loss: 2.1642 - accuracy: 0.6692INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 210ms/step - loss: 2.1642 - accuracy: 0.6692 - val_loss: 1.4224 - val_accuracy: 0.8117
Epoch 16/200
313/313 [==============================] - ETA: 0s - loss: 2.0069 - accuracy: 0.6876INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 65s 209ms/step - loss: 2.0069 - accuracy: 0.6876 - val_loss: 1.3139 - val_accuracy: 0.8183
Epoch 17/200
313/313 [==============================] - 39s 125ms/step - loss: 1.8770 - accuracy: 0.7078 - val_loss: 1.2726 - val_accuracy: 0.7962
Epoch 18/200
313/313 [==============================] - ETA: 0s - loss: 1.7835 - accuracy: 0.7260INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 210ms/step - loss: 1.7835 - accuracy: 0.7260 - val_loss: 1.1139 - val_accuracy: 0.8410
Epoch 19/200
313/313 [==============================] - ETA: 0s - loss: 1.6882 - accuracy: 0.7492INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 65s 209ms/step - loss: 1.6882 - accuracy: 0.7492 - val_loss: 1.0685 - val_accuracy: 0.8521
Epoch 20/200
313/313 [==============================] - ETA: 0s - loss: 1.6104 - accuracy: 0.7664INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 65s 209ms/step - loss: 1.6104 - accuracy: 0.7664 - val_loss: 0.9207 - val_accuracy: 0.8878
Epoch 21/200
313/313 [==============================] - ETA: 0s - loss: 1.5404 - accuracy: 0.7856INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 210ms/step - loss: 1.5404 - accuracy: 0.7856 - val_loss: 0.8776 - val_accuracy: 0.8966
Epoch 22/200
313/313 [==============================] - ETA: 0s - loss: 1.4881 - accuracy: 0.8046INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 65s 209ms/step - loss: 1.4881 - accuracy: 0.8046 - val_loss: 0.8543 - val_accuracy: 0.9065
Epoch 23/200
313/313 [==============================] - ETA: 0s - loss: 1.4490 - accuracy: 0.8169INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 210ms/step - loss: 1.4490 - accuracy: 0.8169 - val_loss: 0.8218 - val_accuracy: 0.9152
Epoch 24/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4393 - accuracy: 0.8207 - val_loss: 2.0849 - val_accuracy: 0.4864
Epoch 25/200
313/313 [==============================] - 39s 125ms/step - loss: 1.8184 - accuracy: 0.6625 - val_loss: 1.2255 - val_accuracy: 0.7694
Epoch 26/200
313/313 [==============================] - 39s 125ms/step - loss: 1.7108 - accuracy: 0.6901 - val_loss: 1.1985 - val_accuracy: 0.7403
Epoch 27/200
313/313 [==============================] - 39s 125ms/step - loss: 1.6376 - accuracy: 0.7041 - val_loss: 1.3967 - val_accuracy: 0.6758
Epoch 28/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5922 - accuracy: 0.7096 - val_loss: 1.0343 - val_accuracy: 0.7878
Epoch 29/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5560 - accuracy: 0.7166 - val_loss: 0.9995 - val_accuracy: 0.8047
Epoch 30/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5204 - accuracy: 0.7265 - val_loss: 1.1395 - val_accuracy: 0.7529
Epoch 31/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4963 - accuracy: 0.7330 - val_loss: 0.9705 - val_accuracy: 0.8109
Epoch 32/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4746 - accuracy: 0.7411 - val_loss: 0.9781 - val_accuracy: 0.8120
Epoch 33/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4595 - accuracy: 0.7452 - val_loss: 0.8825 - val_accuracy: 0.8360
Epoch 34/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4316 - accuracy: 0.7545 - val_loss: 0.8161 - val_accuracy: 0.8623
Epoch 35/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4207 - accuracy: 0.7606 - val_loss: 0.8484 - val_accuracy: 0.8482
Epoch 36/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4037 - accuracy: 0.7674 - val_loss: 0.9455 - val_accuracy: 0.8174
Epoch 37/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3877 - accuracy: 0.7724 - val_loss: 0.8390 - val_accuracy: 0.8483
Epoch 38/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3640 - accuracy: 0.7833 - val_loss: 0.7881 - val_accuracy: 0.8758
Epoch 39/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3377 - accuracy: 0.7949 - val_loss: 0.7847 - val_accuracy: 0.8761
Epoch 40/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3222 - accuracy: 0.7997 - val_loss: 0.8326 - val_accuracy: 0.8546
Epoch 41/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2985 - accuracy: 0.8101 - val_loss: 0.6920 - val_accuracy: 0.8982
Epoch 42/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2742 - accuracy: 0.8140 - val_loss: 0.7444 - val_accuracy: 0.8810
Epoch 43/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2431 - accuracy: 0.8279 - val_loss: 0.7131 - val_accuracy: 0.8916
Epoch 44/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2150 - accuracy: 0.8356 - val_loss: 0.6612 - val_accuracy: 0.9071
Epoch 45/200
313/313 [==============================] - 39s 125ms/step - loss: 1.1809 - accuracy: 0.8466 - val_loss: 0.6474 - val_accuracy: 0.9130
Epoch 46/200
313/313 [==============================] - ETA: 0s - loss: 1.1571 - accuracy: 0.8510INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 65s 209ms/step - loss: 1.1571 - accuracy: 0.8510 - val_loss: 0.6085 - val_accuracy: 0.9195
Epoch 47/200
313/313 [==============================] - 39s 125ms/step - loss: 1.1229 - accuracy: 0.8632 - val_loss: 0.6045 - val_accuracy: 0.9184
Epoch 48/200
313/313 [==============================] - ETA: 0s - loss: 1.0941 - accuracy: 0.8704INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 65s 209ms/step - loss: 1.0941 - accuracy: 0.8704 - val_loss: 0.5790 - val_accuracy: 0.9272
Epoch 49/200
313/313 [==============================] - ETA: 0s - loss: 1.0626 - accuracy: 0.8788INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 210ms/step - loss: 1.0626 - accuracy: 0.8788 - val_loss: 0.5564 - val_accuracy: 0.9322
Epoch 50/200
313/313 [==============================] - ETA: 0s - loss: 1.0365 - accuracy: 0.8850INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 65s 209ms/step - loss: 1.0365 - accuracy: 0.8850 - val_loss: 0.5365 - val_accuracy: 0.9367
Epoch 51/200
313/313 [==============================] - ETA: 0s - loss: 1.0150 - accuracy: 0.8895INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 210ms/step - loss: 1.0150 - accuracy: 0.8895 - val_loss: 0.5338 - val_accuracy: 0.9383
Epoch 52/200
313/313 [==============================] - ETA: 0s - loss: 0.9986 - accuracy: 0.8934INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 65s 209ms/step - loss: 0.9986 - accuracy: 0.8934 - val_loss: 0.5229 - val_accuracy: 0.9429
Epoch 53/200
313/313 [==============================] - 39s 126ms/step - loss: 0.9831 - accuracy: 0.8970 - val_loss: 0.5148 - val_accuracy: 0.9427
Epoch 54/200
313/313 [==============================] - 39s 125ms/step - loss: 0.9783 - accuracy: 0.9008 - val_loss: 0.5108 - val_accuracy: 0.9429
Epoch 55/200
313/313 [==============================] - ETA: 0s - loss: 0.9722 - accuracy: 0.9018INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 65s 209ms/step - loss: 0.9722 - accuracy: 0.9018 - val_loss: 0.5073 - val_accuracy: 0.9430
Epoch 56/200
313/313 [==============================] - 39s 125ms/step - loss: 0.9957 - accuracy: 0.8953 - val_loss: 5.9977 - val_accuracy: 0.1528
Epoch 57/200
313/313 [==============================] - 39s 126ms/step - loss: 1.5731 - accuracy: 0.7130 - val_loss: 1.0766 - val_accuracy: 0.7941
Epoch 58/200
313/313 [==============================] - 39s 126ms/step - loss: 1.5325 - accuracy: 0.7470 - val_loss: 0.9146 - val_accuracy: 0.8529
Epoch 59/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5171 - accuracy: 0.7580 - val_loss: 1.0302 - val_accuracy: 0.8152
Epoch 60/200
313/313 [==============================] - 39s 126ms/step - loss: 1.5141 - accuracy: 0.7624 - val_loss: 0.9375 - val_accuracy: 0.8557
Epoch 61/200
313/313 [==============================] - 39s 126ms/step - loss: 1.5066 - accuracy: 0.7696 - val_loss: 1.0323 - val_accuracy: 0.8193
Epoch 62/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5076 - accuracy: 0.7677 - val_loss: 1.1255 - val_accuracy: 0.7985
Epoch 63/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5054 - accuracy: 0.7695 - val_loss: 0.9050 - val_accuracy: 0.8627
Epoch 64/200
313/313 [==============================] - 39s 126ms/step - loss: 1.5007 - accuracy: 0.7738 - val_loss: 0.9038 - val_accuracy: 0.8659
Epoch 65/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4998 - accuracy: 0.7750 - val_loss: 1.0054 - val_accuracy: 0.8408
Epoch 66/200
313/313 [==============================] - 39s 126ms/step - loss: 1.4832 - accuracy: 0.7816 - val_loss: 1.1450 - val_accuracy: 0.7915
Epoch 67/200
313/313 [==============================] - 39s 126ms/step - loss: 1.4919 - accuracy: 0.7783 - val_loss: 0.9424 - val_accuracy: 0.8571
Epoch 68/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4876 - accuracy: 0.7829 - val_loss: 1.0052 - val_accuracy: 0.8436
Epoch 69/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4831 - accuracy: 0.7851 - val_loss: 0.9546 - val_accuracy: 0.8528
Epoch 70/200
313/313 [==============================] - 39s 126ms/step - loss: 1.4796 - accuracy: 0.7867 - val_loss: 1.2661 - val_accuracy: 0.7605
Epoch 71/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4709 - accuracy: 0.7882 - val_loss: 1.0016 - val_accuracy: 0.8389
Epoch 72/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4664 - accuracy: 0.7910 - val_loss: 0.9696 - val_accuracy: 0.8547
Epoch 73/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4554 - accuracy: 0.7914 - val_loss: 0.9574 - val_accuracy: 0.8566
Epoch 74/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4531 - accuracy: 0.7956 - val_loss: 0.9053 - val_accuracy: 0.8683
Epoch 75/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4509 - accuracy: 0.7979 - val_loss: 0.9482 - val_accuracy: 0.8631
Epoch 76/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4411 - accuracy: 0.8026 - val_loss: 0.8697 - val_accuracy: 0.8863
Epoch 77/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4326 - accuracy: 0.8061 - val_loss: 0.9243 - val_accuracy: 0.8681
Epoch 78/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4256 - accuracy: 0.8067 - val_loss: 1.0596 - val_accuracy: 0.8333
Epoch 79/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4241 - accuracy: 0.8085 - val_loss: 0.8757 - val_accuracy: 0.8820
Epoch 80/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4093 - accuracy: 0.8129 - val_loss: 0.9546 - val_accuracy: 0.8585
Epoch 81/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4037 - accuracy: 0.8121 - val_loss: 0.9186 - val_accuracy: 0.8703
Epoch 82/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3897 - accuracy: 0.8173 - val_loss: 0.8719 - val_accuracy: 0.8842
Epoch 83/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3804 - accuracy: 0.8210 - val_loss: 0.9169 - val_accuracy: 0.8694
Epoch 84/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3716 - accuracy: 0.8227 - val_loss: 1.0167 - val_accuracy: 0.8385
Epoch 85/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3592 - accuracy: 0.8265 - val_loss: 0.8416 - val_accuracy: 0.8943
Epoch 86/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3525 - accuracy: 0.8307 - val_loss: 0.8073 - val_accuracy: 0.8990
Epoch 87/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3375 - accuracy: 0.8386 - val_loss: 0.8636 - val_accuracy: 0.8854
Epoch 88/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3281 - accuracy: 0.8368 - val_loss: 0.8529 - val_accuracy: 0.8889
Epoch 89/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3088 - accuracy: 0.8413 - val_loss: 0.8237 - val_accuracy: 0.8968
Epoch 90/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2946 - accuracy: 0.8446 - val_loss: 0.8320 - val_accuracy: 0.8943
Epoch 91/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2833 - accuracy: 0.8497 - val_loss: 0.7924 - val_accuracy: 0.9003
Epoch 92/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2650 - accuracy: 0.8561 - val_loss: 0.7828 - val_accuracy: 0.9024
Epoch 93/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2463 - accuracy: 0.8603 - val_loss: 0.7512 - val_accuracy: 0.9108
Epoch 94/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2313 - accuracy: 0.8637 - val_loss: 0.7769 - val_accuracy: 0.9054
Epoch 95/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2133 - accuracy: 0.8662 - val_loss: 0.7207 - val_accuracy: 0.9194
Epoch 96/200
313/313 [==============================] - 39s 125ms/step - loss: 1.1922 - accuracy: 0.8719 - val_loss: 0.7302 - val_accuracy: 0.9152
Epoch 97/200
313/313 [==============================] - 39s 125ms/step - loss: 1.1752 - accuracy: 0.8760 - val_loss: 0.7314 - val_accuracy: 0.9091
Epoch 98/200
313/313 [==============================] - 39s 125ms/step - loss: 1.1552 - accuracy: 0.8828 - val_loss: 0.7048 - val_accuracy: 0.9210
Epoch 99/200
313/313 [==============================] - 39s 125ms/step - loss: 1.1332 - accuracy: 0.8839 - val_loss: 0.7225 - val_accuracy: 0.9087
Epoch 100/200
313/313 [==============================] - 39s 125ms/step - loss: 1.1199 - accuracy: 0.8849 - val_loss: 0.6742 - val_accuracy: 0.9221
Epoch 101/200
313/313 [==============================] - 39s 126ms/step - loss: 1.0987 - accuracy: 0.8894 - val_loss: 0.6447 - val_accuracy: 0.9235
Epoch 102/200
313/313 [==============================] - 39s 126ms/step - loss: 1.0774 - accuracy: 0.8949 - val_loss: 0.6278 - val_accuracy: 0.9311
Epoch 103/200
313/313 [==============================] - 39s 126ms/step - loss: 1.0604 - accuracy: 0.8989 - val_loss: 0.6261 - val_accuracy: 0.9291
Epoch 104/200
313/313 [==============================] - 39s 126ms/step - loss: 1.0402 - accuracy: 0.9064 - val_loss: 0.6026 - val_accuracy: 0.9344
Epoch 105/200
313/313 [==============================] - 39s 126ms/step - loss: 1.0205 - accuracy: 0.9076 - val_loss: 0.5954 - val_accuracy: 0.9346
Epoch 106/200
313/313 [==============================] - 39s 126ms/step - loss: 1.0089 - accuracy: 0.9107 - val_loss: 0.5827 - val_accuracy: 0.9374
Epoch 107/200
313/313 [==============================] - 39s 126ms/step - loss: 0.9909 - accuracy: 0.9134 - val_loss: 0.5555 - val_accuracy: 0.9400
Epoch 108/200
313/313 [==============================] - 39s 126ms/step - loss: 0.9780 - accuracy: 0.9132 - val_loss: 0.5514 - val_accuracy: 0.9400
Epoch 109/200
313/313 [==============================] - 39s 126ms/step - loss: 0.9631 - accuracy: 0.9194 - val_loss: 0.5410 - val_accuracy: 0.9422
Epoch 110/200
313/313 [==============================] - 39s 126ms/step - loss: 0.9513 - accuracy: 0.9219 - val_loss: 0.5409 - val_accuracy: 0.9430
Epoch 111/200
313/313 [==============================] - ETA: 0s - loss: 0.9393 - accuracy: 0.9216INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 212ms/step - loss: 0.9393 - accuracy: 0.9216 - val_loss: 0.5306 - val_accuracy: 0.9431
Epoch 112/200
313/313 [==============================] - ETA: 0s - loss: 0.9338 - accuracy: 0.9249INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 211ms/step - loss: 0.9338 - accuracy: 0.9249 - val_loss: 0.5197 - val_accuracy: 0.9458
Epoch 113/200
313/313 [==============================] - 40s 126ms/step - loss: 0.9237 - accuracy: 0.9278 - val_loss: 0.5179 - val_accuracy: 0.9441
Epoch 114/200
313/313 [==============================] - ETA: 0s - loss: 0.9180 - accuracy: 0.9295INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 212ms/step - loss: 0.9180 - accuracy: 0.9295 - val_loss: 0.5117 - val_accuracy: 0.9459
Epoch 115/200
313/313 [==============================] - 39s 126ms/step - loss: 0.9142 - accuracy: 0.9293 - val_loss: 0.5089 - val_accuracy: 0.9454
Epoch 116/200
313/313 [==============================] - 39s 126ms/step - loss: 0.9110 - accuracy: 0.9308 - val_loss: 0.5048 - val_accuracy: 0.9457
Epoch 117/200
313/313 [==============================] - ETA: 0s - loss: 0.9066 - accuracy: 0.9293INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix/assets
313/313 [==============================] - 66s 210ms/step - loss: 0.9066 - accuracy: 0.9293 - val_loss: 0.5057 - val_accuracy: 0.9461
Epoch 118/200
313/313 [==============================] - 39s 126ms/step - loss: 0.9065 - accuracy: 0.9303 - val_loss: 0.5052 - val_accuracy: 0.9458
Epoch 119/200
313/313 [==============================] - 39s 126ms/step - loss: 0.9038 - accuracy: 0.9327 - val_loss: 0.5048 - val_accuracy: 0.9455
Epoch 120/200
313/313 [==============================] - 39s 125ms/step - loss: 1.0264 - accuracy: 0.8869 - val_loss: 11.8661 - val_accuracy: 0.0972
Epoch 121/200
313/313 [==============================] - 39s 125ms/step - loss: 1.6368 - accuracy: 0.7214 - val_loss: 1.2073 - val_accuracy: 0.7793
Epoch 122/200
313/313 [==============================] - 39s 126ms/step - loss: 1.5753 - accuracy: 0.7644 - val_loss: 1.1046 - val_accuracy: 0.8255
Epoch 123/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5656 - accuracy: 0.7760 - val_loss: 1.0350 - val_accuracy: 0.8460
Epoch 124/200
313/313 [==============================] - 39s 126ms/step - loss: 1.5541 - accuracy: 0.7812 - val_loss: 1.1917 - val_accuracy: 0.8067
Epoch 125/200
313/313 [==============================] - 39s 126ms/step - loss: 1.5493 - accuracy: 0.7837 - val_loss: 1.0039 - val_accuracy: 0.8569
Epoch 126/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5514 - accuracy: 0.7846 - val_loss: 1.1006 - val_accuracy: 0.8277
Epoch 127/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5468 - accuracy: 0.7886 - val_loss: 1.0880 - val_accuracy: 0.8401
Epoch 128/200
313/313 [==============================] - 39s 126ms/step - loss: 1.5507 - accuracy: 0.7873 - val_loss: 0.9995 - val_accuracy: 0.8667
Epoch 129/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5375 - accuracy: 0.7967 - val_loss: 1.0588 - val_accuracy: 0.8437
Epoch 130/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5352 - accuracy: 0.7937 - val_loss: 1.0517 - val_accuracy: 0.8515
Epoch 131/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5374 - accuracy: 0.7903 - val_loss: 1.0275 - val_accuracy: 0.8553
Epoch 132/200
313/313 [==============================] - 39s 126ms/step - loss: 1.5313 - accuracy: 0.7953 - val_loss: 1.1623 - val_accuracy: 0.8144
Epoch 133/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5401 - accuracy: 0.7951 - val_loss: 1.0636 - val_accuracy: 0.8449
Epoch 134/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5395 - accuracy: 0.7938 - val_loss: 1.0476 - val_accuracy: 0.8473
Epoch 135/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5275 - accuracy: 0.7987 - val_loss: 1.1208 - val_accuracy: 0.8325
Epoch 136/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5321 - accuracy: 0.7962 - val_loss: 0.9670 - val_accuracy: 0.8812
Epoch 137/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5245 - accuracy: 0.7990 - val_loss: 0.9952 - val_accuracy: 0.8693
Epoch 138/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5285 - accuracy: 0.7987 - val_loss: 1.1263 - val_accuracy: 0.8337
Epoch 139/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5258 - accuracy: 0.7982 - val_loss: 1.1517 - val_accuracy: 0.8209
Epoch 140/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5150 - accuracy: 0.8017 - val_loss: 1.0669 - val_accuracy: 0.8508
Epoch 141/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5191 - accuracy: 0.8017 - val_loss: 1.0124 - val_accuracy: 0.8643
Epoch 142/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5182 - accuracy: 0.8012 - val_loss: 0.9927 - val_accuracy: 0.8780
Epoch 143/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5169 - accuracy: 0.8004 - val_loss: 1.0680 - val_accuracy: 0.8555
Epoch 144/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5119 - accuracy: 0.8069 - val_loss: 1.0130 - val_accuracy: 0.8716
Epoch 145/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5162 - accuracy: 0.8034 - val_loss: 1.0978 - val_accuracy: 0.8368
Epoch 146/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5070 - accuracy: 0.8057 - val_loss: 0.9741 - val_accuracy: 0.8829
Epoch 147/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5162 - accuracy: 0.8039 - val_loss: 1.0389 - val_accuracy: 0.8585
Epoch 148/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5174 - accuracy: 0.8056 - val_loss: 1.0562 - val_accuracy: 0.8567
Epoch 149/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5093 - accuracy: 0.8092 - val_loss: 1.0122 - val_accuracy: 0.8713
Epoch 150/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5098 - accuracy: 0.8082 - val_loss: 1.1118 - val_accuracy: 0.8458
Epoch 151/200
313/313 [==============================] - 39s 125ms/step - loss: 1.5002 - accuracy: 0.8115 - val_loss: 1.0901 - val_accuracy: 0.8471
Epoch 152/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4988 - accuracy: 0.8140 - val_loss: 0.9402 - val_accuracy: 0.8899
Epoch 153/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4932 - accuracy: 0.8144 - val_loss: 0.9630 - val_accuracy: 0.8787
Epoch 154/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4866 - accuracy: 0.8120 - val_loss: 1.0751 - val_accuracy: 0.8488
Epoch 155/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4795 - accuracy: 0.8156 - val_loss: 0.9625 - val_accuracy: 0.8792
Epoch 156/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4794 - accuracy: 0.8151 - val_loss: 0.9714 - val_accuracy: 0.8757
Epoch 157/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4756 - accuracy: 0.8171 - val_loss: 1.2053 - val_accuracy: 0.8281
Epoch 158/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4779 - accuracy: 0.8163 - val_loss: 0.9648 - val_accuracy: 0.8780
Epoch 159/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4753 - accuracy: 0.8180 - val_loss: 0.9233 - val_accuracy: 0.8973
Epoch 160/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4693 - accuracy: 0.8187 - val_loss: 0.9314 - val_accuracy: 0.8905
Epoch 161/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4658 - accuracy: 0.8188 - val_loss: 0.9956 - val_accuracy: 0.8730
Epoch 162/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4564 - accuracy: 0.8243 - val_loss: 0.9960 - val_accuracy: 0.8714
Epoch 163/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4516 - accuracy: 0.8227 - val_loss: 0.9315 - val_accuracy: 0.8870
Epoch 164/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4499 - accuracy: 0.8262 - val_loss: 0.8808 - val_accuracy: 0.9043
Epoch 165/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4431 - accuracy: 0.8276 - val_loss: 0.9837 - val_accuracy: 0.8746
Epoch 166/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4379 - accuracy: 0.8296 - val_loss: 0.9285 - val_accuracy: 0.8919
Epoch 167/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4397 - accuracy: 0.8253 - val_loss: 0.9421 - val_accuracy: 0.8868
Epoch 168/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4360 - accuracy: 0.8278 - val_loss: 0.9893 - val_accuracy: 0.8682
Epoch 169/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4315 - accuracy: 0.8316 - val_loss: 0.9554 - val_accuracy: 0.8809
Epoch 170/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4272 - accuracy: 0.8319 - val_loss: 1.0309 - val_accuracy: 0.8628
Epoch 171/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4155 - accuracy: 0.8385 - val_loss: 0.9403 - val_accuracy: 0.8849
Epoch 172/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4111 - accuracy: 0.8367 - val_loss: 0.9640 - val_accuracy: 0.8773
Epoch 173/200
313/313 [==============================] - 39s 125ms/step - loss: 1.4094 - accuracy: 0.8374 - val_loss: 0.9603 - val_accuracy: 0.8788
Epoch 174/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3939 - accuracy: 0.8424 - val_loss: 0.8803 - val_accuracy: 0.9034
Epoch 175/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3909 - accuracy: 0.8414 - val_loss: 0.9271 - val_accuracy: 0.8868
Epoch 176/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3832 - accuracy: 0.8423 - val_loss: 0.9236 - val_accuracy: 0.8885
Epoch 177/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3846 - accuracy: 0.8422 - val_loss: 0.9439 - val_accuracy: 0.8818
Epoch 178/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3715 - accuracy: 0.8457 - val_loss: 0.9022 - val_accuracy: 0.8920
Epoch 179/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3699 - accuracy: 0.8450 - val_loss: 0.8707 - val_accuracy: 0.9044
Epoch 180/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3655 - accuracy: 0.8485 - val_loss: 0.9272 - val_accuracy: 0.8846
Epoch 181/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3528 - accuracy: 0.8491 - val_loss: 0.8472 - val_accuracy: 0.9081
Epoch 182/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3493 - accuracy: 0.8538 - val_loss: 0.8468 - val_accuracy: 0.9050
Epoch 183/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3384 - accuracy: 0.8551 - val_loss: 0.8864 - val_accuracy: 0.8979
Epoch 184/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3318 - accuracy: 0.8559 - val_loss: 0.8358 - val_accuracy: 0.9108
Epoch 185/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3223 - accuracy: 0.8595 - val_loss: 0.8582 - val_accuracy: 0.9051
Epoch 186/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3138 - accuracy: 0.8611 - val_loss: 0.8947 - val_accuracy: 0.8916
Epoch 187/200
313/313 [==============================] - 39s 125ms/step - loss: 1.3102 - accuracy: 0.8611 - val_loss: 0.8267 - val_accuracy: 0.9104
Epoch 188/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2965 - accuracy: 0.8643 - val_loss: 0.8866 - val_accuracy: 0.8932
Epoch 189/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2893 - accuracy: 0.8674 - val_loss: 0.8033 - val_accuracy: 0.9133
Epoch 190/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2883 - accuracy: 0.8645 - val_loss: 0.8321 - val_accuracy: 0.9057
Epoch 191/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2738 - accuracy: 0.8698 - val_loss: 0.8157 - val_accuracy: 0.9088
Epoch 192/200
313/313 [==============================] - 39s 126ms/step - loss: 1.2665 - accuracy: 0.8706 - val_loss: 0.8057 - val_accuracy: 0.9120
Epoch 193/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2487 - accuracy: 0.8760 - val_loss: 0.7715 - val_accuracy: 0.9184
Epoch 194/200
313/313 [==============================] - 39s 126ms/step - loss: 1.2394 - accuracy: 0.8750 - val_loss: 0.7754 - val_accuracy: 0.9166
Epoch 195/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2349 - accuracy: 0.8760 - val_loss: 0.7755 - val_accuracy: 0.9104
Epoch 196/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2252 - accuracy: 0.8790 - val_loss: 0.8180 - val_accuracy: 0.9070
Epoch 197/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2115 - accuracy: 0.8824 - val_loss: 0.7537 - val_accuracy: 0.9172
Epoch 198/200
313/313 [==============================] - 39s 125ms/step - loss: 1.2015 - accuracy: 0.8827 - val_loss: 0.7592 - val_accuracy: 0.9151
Epoch 199/200
313/313 [==============================] - 39s 125ms/step - loss: 1.1906 - accuracy: 0.8860 - val_loss: 0.7727 - val_accuracy: 0.9148
Epoch 200/200
313/313 [==============================] - 39s 126ms/step - loss: 1.1849 - accuracy: 0.8838 - val_loss: 0.7476 - val_accuracy: 0.9193
Saving best model to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix
History saved to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/history.csv
display(results)
fig.show()
Epochs                                                            200
Batch Size                                                        128
Model Name                                   SEWRN_28_10_Fixed_Cutmix
Remarks
Model Path          /content/drive/MyDrive/Data/DELE CA1/CIFAR10/S...
Train Loss                                                     0.9066
Test Loss                                                    0.505669
Train Acc                                                     0.92925
Test Acc                                                       0.9461
[Train - Test] Acc                                           -0.01685
dtype: object
It can be seen that the WRN with SE blocks and CutMix does very well: it avoids overfitting (the train-test accuracy gap is slightly negative) while achieving a high validation accuracy of 94.6%.
As an additional improvement, I try replacing the activation function with the Mish activation, defined as $$ f(z) = z \cdot \tanh (\operatorname{softplus} (z)) $$
It has been shown to outperform ReLU on many benchmark datasets, at some cost in training time.
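For illustration, the formula above can be implemented directly; this is a minimal pure-Python sketch of the function's behaviour, while the `mish` passed to the model below is assumed to come from an existing implementation (e.g. TensorFlow Addons' `tfa.activations.mish`):

```python
import math

def softplus(z):
    # softplus(z) = ln(1 + e^z), computed stably for moderate z
    return math.log1p(math.exp(z))

def mish(z):
    # Mish: f(z) = z * tanh(softplus(z))
    return z * math.tanh(softplus(z))

# Like ReLU, Mish is near-identity for large positive inputs,
# but it is smooth everywhere and lets small negative values
# pass through instead of zeroing them out.
print(mish(0.0))    # 0.0
print(mish(10.0))   # ~10.0
print(mish(-1.0))   # small negative value, not clipped to 0
```

The smooth, non-monotonic shape around zero is what is credited with Mish's gains over ReLU; the extra `tanh` and `softplus` evaluations are also where the additional training time comes from.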
lr_scheduler = tf.keras.optimizers.schedules.CosineDecayRestarts(
    LR,
    steps_per_epoch * 10  # first_decay_steps: found to work well for CIFAR-10 with shorter training time
)
optimizer = SGD(
    learning_rate=lr_scheduler,
    momentum=momentum
)
train_cutmix_ds, val_ds = set_up_cutmix()
model = build_wideresnet(optimizer, name="SEWRN_28_10_Fixed_Cutmix_Mish", layer=SEWideResNetLayer, activation=mish)
results, fig = evaluator.evaluate_model(model, training_data=train_cutmix_ds, validation_data=val_ds,callbacks=callbacks)
evaluator.save_history()
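As a side note, `CosineDecayRestarts` anneals the learning rate along a cosine curve and then jumps it back up at each restart, with each period growing by a factor `t_mul` (2.0 by default). This is a simplified pure-Python sketch of that schedule with `alpha=0`, using a hypothetical peak rate of 0.1 and the 313 steps/epoch visible in the logs (so a 10-epoch first period is 3130 steps):

```python
import math

def cosine_restart_lr(step, initial_lr, first_decay_steps, t_mul=2.0, m_mul=1.0):
    """Simplified sketch of tf.keras CosineDecayRestarts (alpha=0)."""
    period = first_decay_steps
    peak = initial_lr
    while step >= period:
        step -= period    # move into the next restart period
        period *= t_mul   # each period is t_mul times longer than the last
        peak *= m_mul     # each restart may begin from a lower peak
    # Cosine anneal from `peak` down to 0 across the current period
    return 0.5 * peak * (1.0 + math.cos(math.pi * step / period))

print(cosine_restart_lr(0,    0.1, 3130))  # 0.1 (peak at the start)
print(cosine_restart_lr(1565, 0.1, 3130))  # 0.05 (halfway through the period)
print(cosine_restart_lr(3130, 0.1, 3130))  # 0.1 (restart: back to the peak)
```

The abrupt jump back to the peak rate at each restart is the point of the schedule: it periodically kicks the optimizer out of sharp minima, at the cost of a temporary drop in accuracy right after each restart.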
Model: "SEWRN_28_10_Fixed_Cutmix_Mish"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 32, 32, 3)] 0
normalization (Normalizatio (None, 32, 32, 3) 7
n)
conv2d (Conv2D) (None, 32, 32, 16) 432
wide_resnet_block (WideResn (None, 32, 32, 160) 1732544
etBlock)
wide_resnet_block_1 (WideRe (None, 16, 16, 320) 7024000
snetBlock)
wide_resnet_block_2 (WideRe (None, 8, 8, 640) 28076800
snetBlock)
batch_normalization_24 (Bat (None, 8, 8, 640) 2560
chNormalization)
activation_12 (Activation) (None, 8, 8, 640) 0
global_average_pooling2d_12 (None, 640) 0
(GlobalAveragePooling2D)
dense_24 (Dense) (None, 10) 6410
=================================================================
Total params: 36,842,753
Trainable params: 36,824,794
Non-trainable params: 17,959
_________________________________________________________________
None
Training SEWRN_28_10_Fixed_Cutmix_Mish
Epoch 1/200
313/313 [==============================] - ETA: 0s - loss: 16.9268 - accuracy: 0.1750INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 79s 233ms/step - loss: 16.9268 - accuracy: 0.1750 - val_loss: 16.1318 - val_accuracy: 0.2007
Epoch 2/200
313/313 [==============================] - ETA: 0s - loss: 15.5663 - accuracy: 0.2005INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 232ms/step - loss: 15.5663 - accuracy: 0.2005 - val_loss: 14.8881 - val_accuracy: 0.2295
Epoch 3/200
313/313 [==============================] - ETA: 0s - loss: 14.4182 - accuracy: 0.2085INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 234ms/step - loss: 14.4182 - accuracy: 0.2085 - val_loss: 13.8085 - val_accuracy: 0.2493
Epoch 4/200
313/313 [==============================] - ETA: 0s - loss: 13.4085 - accuracy: 0.2193INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 13.4085 - accuracy: 0.2193 - val_loss: 12.8504 - val_accuracy: 0.2626
Epoch 5/200
313/313 [==============================] - ETA: 0s - loss: 12.5199 - accuracy: 0.2242INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 234ms/step - loss: 12.5199 - accuracy: 0.2242 - val_loss: 12.0107 - val_accuracy: 0.2727
Epoch 6/200
313/313 [==============================] - ETA: 0s - loss: 11.7313 - accuracy: 0.2311INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 232ms/step - loss: 11.7313 - accuracy: 0.2311 - val_loss: 11.2620 - val_accuracy: 0.2838
Epoch 7/200
313/313 [==============================] - ETA: 0s - loss: 11.0313 - accuracy: 0.2340INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 234ms/step - loss: 11.0313 - accuracy: 0.2340 - val_loss: 10.6045 - val_accuracy: 0.2845
Epoch 8/200
313/313 [==============================] - ETA: 0s - loss: 10.4038 - accuracy: 0.2458INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 10.4038 - accuracy: 0.2458 - val_loss: 10.0016 - val_accuracy: 0.2860
Epoch 9/200
313/313 [==============================] - ETA: 0s - loss: 9.8460 - accuracy: 0.2510INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 234ms/step - loss: 9.8460 - accuracy: 0.2510 - val_loss: 9.4716 - val_accuracy: 0.3007
Epoch 10/200
313/313 [==============================] - 45s 145ms/step - loss: 9.3407 - accuracy: 0.2580 - val_loss: 8.9657 - val_accuracy: 0.2997
Epoch 11/200
313/313 [==============================] - ETA: 0s - loss: 8.8813 - accuracy: 0.2643INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 72s 231ms/step - loss: 8.8813 - accuracy: 0.2643 - val_loss: 8.5181 - val_accuracy: 0.3129
Epoch 12/200
313/313 [==============================] - ETA: 0s - loss: 8.4714 - accuracy: 0.2741INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 232ms/step - loss: 8.4714 - accuracy: 0.2741 - val_loss: 8.1419 - val_accuracy: 0.3153
Epoch 13/200
313/313 [==============================] - ETA: 0s - loss: 8.0955 - accuracy: 0.2865INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 8.0955 - accuracy: 0.2865 - val_loss: 7.7811 - val_accuracy: 0.3330
Epoch 14/200
313/313 [==============================] - 45s 145ms/step - loss: 7.7601 - accuracy: 0.3001 - val_loss: 7.4598 - val_accuracy: 0.3291
Epoch 15/200
313/313 [==============================] - ETA: 0s - loss: 7.4417 - accuracy: 0.3184INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 72s 232ms/step - loss: 7.4417 - accuracy: 0.3184 - val_loss: 7.0658 - val_accuracy: 0.3748
Epoch 16/200
313/313 [==============================] - ETA: 0s - loss: 7.1488 - accuracy: 0.3338INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 232ms/step - loss: 7.1488 - accuracy: 0.3338 - val_loss: 6.7647 - val_accuracy: 0.3841
Epoch 17/200
313/313 [==============================] - ETA: 0s - loss: 6.8935 - accuracy: 0.3460INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 6.8935 - accuracy: 0.3460 - val_loss: 6.4964 - val_accuracy: 0.3866
Epoch 18/200
313/313 [==============================] - ETA: 0s - loss: 6.6565 - accuracy: 0.3622INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 72s 231ms/step - loss: 6.6565 - accuracy: 0.3622 - val_loss: 6.2162 - val_accuracy: 0.4426
Epoch 19/200
313/313 [==============================] - ETA: 0s - loss: 6.4417 - accuracy: 0.3803INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 232ms/step - loss: 6.4417 - accuracy: 0.3803 - val_loss: 6.0005 - val_accuracy: 0.4684
Epoch 20/200
313/313 [==============================] - ETA: 0s - loss: 6.2489 - accuracy: 0.3900INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 6.2489 - accuracy: 0.3900 - val_loss: 5.7670 - val_accuracy: 0.4930
Epoch 21/200
313/313 [==============================] - ETA: 0s - loss: 6.0744 - accuracy: 0.4047INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 232ms/step - loss: 6.0744 - accuracy: 0.4047 - val_loss: 5.5819 - val_accuracy: 0.5110
Epoch 22/200
313/313 [==============================] - 45s 145ms/step - loss: 5.9126 - accuracy: 0.4205 - val_loss: 5.4426 - val_accuracy: 0.5072
Epoch 23/200
313/313 [==============================] - ETA: 0s - loss: 5.7665 - accuracy: 0.4339INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 232ms/step - loss: 5.7665 - accuracy: 0.4339 - val_loss: 5.2278 - val_accuracy: 0.5566
Epoch 24/200
313/313 [==============================] - ETA: 0s - loss: 5.6327 - accuracy: 0.4496INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 72s 232ms/step - loss: 5.6327 - accuracy: 0.4496 - val_loss: 5.1163 - val_accuracy: 0.5620
Epoch 25/200
313/313 [==============================] - ETA: 0s - loss: 5.5040 - accuracy: 0.4658INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 72s 232ms/step - loss: 5.5040 - accuracy: 0.4658 - val_loss: 4.9721 - val_accuracy: 0.5939
Epoch 26/200
313/313 [==============================] - ETA: 0s - loss: 5.4004 - accuracy: 0.4760INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 5.4004 - accuracy: 0.4760 - val_loss: 4.8495 - val_accuracy: 0.6056
Epoch 27/200
313/313 [==============================] - ETA: 0s - loss: 5.3000 - accuracy: 0.4911INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 234ms/step - loss: 5.3000 - accuracy: 0.4911 - val_loss: 4.7372 - val_accuracy: 0.6173
Epoch 28/200
313/313 [==============================] - ETA: 0s - loss: 5.2103 - accuracy: 0.5010INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 72s 231ms/step - loss: 5.2103 - accuracy: 0.5010 - val_loss: 4.6268 - val_accuracy: 0.6402
Epoch 29/200
313/313 [==============================] - 45s 144ms/step - loss: 5.1363 - accuracy: 0.5046 - val_loss: 4.5494 - val_accuracy: 0.6402
Epoch 30/200
313/313 [==============================] - ETA: 0s - loss: 5.0636 - accuracy: 0.5171INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 232ms/step - loss: 5.0636 - accuracy: 0.5171 - val_loss: 4.4715 - val_accuracy: 0.6432
Epoch 31/200
313/313 [==============================] - ETA: 0s - loss: 5.0064 - accuracy: 0.5202INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 72s 231ms/step - loss: 5.0064 - accuracy: 0.5202 - val_loss: 4.3804 - val_accuracy: 0.6664
Epoch 32/200
313/313 [==============================] - 45s 144ms/step - loss: 4.9450 - accuracy: 0.5296 - val_loss: 4.3567 - val_accuracy: 0.6516
Epoch 33/200
313/313 [==============================] - ETA: 0s - loss: 4.8958 - accuracy: 0.5351INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 232ms/step - loss: 4.8958 - accuracy: 0.5351 - val_loss: 4.2884 - val_accuracy: 0.6699
Epoch 34/200
313/313 [==============================] - ETA: 0s - loss: 4.8518 - accuracy: 0.5422INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 72s 231ms/step - loss: 4.8518 - accuracy: 0.5422 - val_loss: 4.2204 - val_accuracy: 0.6902
Epoch 35/200
313/313 [==============================] - ETA: 0s - loss: 4.8204 - accuracy: 0.5458INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 72s 232ms/step - loss: 4.8204 - accuracy: 0.5458 - val_loss: 4.1900 - val_accuracy: 0.6944
Epoch 36/200
313/313 [==============================] - 45s 144ms/step - loss: 4.7747 - accuracy: 0.5532 - val_loss: 4.1461 - val_accuracy: 0.6927
Epoch 37/200
313/313 [==============================] - ETA: 0s - loss: 4.7458 - accuracy: 0.5620INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 234ms/step - loss: 4.7458 - accuracy: 0.5620 - val_loss: 4.1208 - val_accuracy: 0.6952
Epoch 38/200
313/313 [==============================] - ETA: 0s - loss: 4.7291 - accuracy: 0.5591INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 4.7291 - accuracy: 0.5591 - val_loss: 4.0871 - val_accuracy: 0.7046
Epoch 39/200
313/313 [==============================] - 45s 145ms/step - loss: 4.7031 - accuracy: 0.5642 - val_loss: 4.0708 - val_accuracy: 0.7025
Epoch 40/200
313/313 [==============================] - ETA: 0s - loss: 4.6865 - accuracy: 0.5660INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 234ms/step - loss: 4.6865 - accuracy: 0.5660 - val_loss: 4.0453 - val_accuracy: 0.7090
Epoch 41/200
313/313 [==============================] - ETA: 0s - loss: 4.6745 - accuracy: 0.5640INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 4.6745 - accuracy: 0.5640 - val_loss: 4.0260 - val_accuracy: 0.7189
Epoch 42/200
313/313 [==============================] - ETA: 0s - loss: 4.6664 - accuracy: 0.5724INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 4.6664 - accuracy: 0.5724 - val_loss: 4.0167 - val_accuracy: 0.7195
Epoch 43/200
313/313 [==============================] - ETA: 0s - loss: 4.6504 - accuracy: 0.5755INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 234ms/step - loss: 4.6504 - accuracy: 0.5755 - val_loss: 3.9998 - val_accuracy: 0.7221
Epoch 44/200
313/313 [==============================] - ETA: 0s - loss: 4.6492 - accuracy: 0.5768INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 4.6492 - accuracy: 0.5768 - val_loss: 3.9899 - val_accuracy: 0.7251
Epoch 45/200
313/313 [==============================] - ETA: 0s - loss: 4.6412 - accuracy: 0.5812INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 4.6412 - accuracy: 0.5812 - val_loss: 3.9821 - val_accuracy: 0.7280
Epoch 46/200
313/313 [==============================] - 45s 145ms/step - loss: 4.6368 - accuracy: 0.5803 - val_loss: 3.9806 - val_accuracy: 0.7276
Epoch 47/200
313/313 [==============================] - ETA: 0s - loss: 4.6345 - accuracy: 0.5812INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 4.6345 - accuracy: 0.5812 - val_loss: 3.9799 - val_accuracy: 0.7287
Epoch 48/200
313/313 [==============================] - 45s 145ms/step - loss: 4.7946 - accuracy: 0.4593 - val_loss: 4.4873 - val_accuracy: 0.3979
Epoch 49/200
313/313 [==============================] - 45s 145ms/step - loss: 4.2588 - accuracy: 0.3950 - val_loss: 3.5582 - val_accuracy: 0.4724
Epoch 50/200
313/313 [==============================] - 45s 145ms/step - loss: 3.5225 - accuracy: 0.4549 - val_loss: 2.7892 - val_accuracy: 0.5644
Epoch 51/200
313/313 [==============================] - 45s 145ms/step - loss: 3.0110 - accuracy: 0.4858 - val_loss: 2.5744 - val_accuracy: 0.5058
Epoch 52/200
313/313 [==============================] - 45s 145ms/step - loss: 2.6251 - accuracy: 0.5081 - val_loss: 1.9572 - val_accuracy: 0.6261
Epoch 53/200
313/313 [==============================] - 45s 145ms/step - loss: 2.3632 - accuracy: 0.5243 - val_loss: 1.7947 - val_accuracy: 0.6019
Epoch 54/200
313/313 [==============================] - 45s 145ms/step - loss: 2.1509 - accuracy: 0.5387 - val_loss: 1.6438 - val_accuracy: 0.5993
Epoch 55/200
313/313 [==============================] - 45s 145ms/step - loss: 2.0069 - accuracy: 0.5494 - val_loss: 1.3677 - val_accuracy: 0.6768
Epoch 56/200
313/313 [==============================] - 45s 145ms/step - loss: 1.8874 - accuracy: 0.5630 - val_loss: 1.2570 - val_accuracy: 0.6851
Epoch 57/200
313/313 [==============================] - 45s 145ms/step - loss: 1.8118 - accuracy: 0.5668 - val_loss: 1.1917 - val_accuracy: 0.6931
Epoch 58/200
313/313 [==============================] - 45s 144ms/step - loss: 1.7473 - accuracy: 0.5767 - val_loss: 1.2100 - val_accuracy: 0.6847
Epoch 59/200
313/313 [==============================] - 45s 144ms/step - loss: 1.6983 - accuracy: 0.5854 - val_loss: 1.1450 - val_accuracy: 0.6874
Epoch 60/200
313/313 [==============================] - 45s 144ms/step - loss: 1.6607 - accuracy: 0.5930 - val_loss: 1.0598 - val_accuracy: 0.7107
Epoch 61/200
313/313 [==============================] - ETA: 0s - loss: 1.6243 - accuracy: 0.6020INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 1.6243 - accuracy: 0.6020 - val_loss: 0.9495 - val_accuracy: 0.7495
Epoch 62/200
313/313 [==============================] - ETA: 0s - loss: 1.6005 - accuracy: 0.6051INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 232ms/step - loss: 1.6005 - accuracy: 0.6051 - val_loss: 0.9223 - val_accuracy: 0.7588
Epoch 63/200
313/313 [==============================] - 45s 144ms/step - loss: 1.5852 - accuracy: 0.6105 - val_loss: 1.0569 - val_accuracy: 0.7002
Epoch 64/200
313/313 [==============================] - 45s 144ms/step - loss: 1.5659 - accuracy: 0.6166 - val_loss: 0.9231 - val_accuracy: 0.7558
Epoch 65/200
313/313 [==============================] - 45s 144ms/step - loss: 1.5512 - accuracy: 0.6238 - val_loss: 0.9414 - val_accuracy: 0.7485
Epoch 66/200
313/313 [==============================] - 45s 144ms/step - loss: 1.5437 - accuracy: 0.6252 - val_loss: 0.9372 - val_accuracy: 0.7453
Epoch 67/200
313/313 [==============================] - 45s 144ms/step - loss: 1.5316 - accuracy: 0.6301 - val_loss: 0.9884 - val_accuracy: 0.7258
Epoch 68/200
313/313 [==============================] - 45s 144ms/step - loss: 1.5268 - accuracy: 0.6320 - val_loss: 0.9237 - val_accuracy: 0.7549
Epoch 69/200
313/313 [==============================] - ETA: 0s - loss: 1.5180 - accuracy: 0.6398INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish/assets
313/313 [==============================] - 73s 233ms/step - loss: 1.5180 - accuracy: 0.6398 - val_loss: 0.8875 - val_accuracy: 0.7739
Epoch 70/200
313/313 [==============================] - 45s 145ms/step - loss: 1.5134 - accuracy: 0.6411 - val_loss: 0.9060 - val_accuracy: 0.7628
Epoch 71/200
313/313 [==============================] - 45s 145ms/step - loss: 1.5073 - accuracy: 0.6464 - val_loss: 0.8691 - val_accuracy: 0.7650
Epoch 72/200
228/313 [====================>.........] - ETA: 11s - loss: 1.5018 - accuracy: 0.6437
Halting Training
Saving best model to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix_Mish
History saved to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/history.csv
It seems that the Mish activation does not improve our model at all, so we will stick with ReLU.
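For reference, Mish is defined as x · tanh(softplus(x)): a smooth, non-monotonic alternative to ReLU that allows small negative outputs. A minimal pure-Python sketch (scalar inputs only, ignoring overflow for very large x; the notebook itself uses the Tensorflow Addons implementation):

```python
import math

def mish(x: float) -> float:
    # Mish: x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x)
    return x * math.tanh(math.log1p(math.exp(x)))

def relu(x: float) -> float:
    return max(0.0, x)
```

Unlike ReLU, Mish lets a little negative signal through (e.g. mish(-1.0) ≈ -0.30 while relu(-1.0) = 0), which is the usual argument for its smoother optimization landscape; in this experiment it did not translate into better accuracy.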
Following the procedure in Snapshot Ensembles: Train 1, Get M for Free, I can use the model weights saved before each warm restart to build an ensemble. The logic is that the warm-restart schedule drives the model into a different local minimum before each restart, so by taking a snapshot at each of these points we can effectively ensemble many diverse models at no additional training cost.
Normally an ensemble of deep learning models would not be used in deployment, since it is computationally expensive at inference time, but as an exercise to see how much further the model can be improved, I think it is worth a try.
To do this, I will use a custom callback that saves a snapshot of the model at the end of every epoch where validation accuracy exceeds 90%.
print("Final Learning Rate:", LR)
Final Learning Rate: 0.05
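Cosine decay with warm restarts (SGDR) anneals the learning rate along a cosine curve and then jumps it back to its initial value, which is what pushes the model out of each local minimum between snapshots. A simplified sketch of the schedule with fixed-length cycles (note that Keras's `CosineDecayRestarts` defaults to `t_mul=2.0`, which doubles the cycle length after each restart):

```python
import math

def sgdr_lr(step, initial_lr=0.05, first_decay_steps=100, alpha=0.0):
    # Simplified SGDR with fixed-length cycles: the LR follows a cosine
    # from initial_lr down to alpha, then restarts at initial_lr.
    t = step % first_decay_steps
    cosine = 0.5 * (1 + math.cos(math.pi * t / first_decay_steps))
    return alpha + (initial_lr - alpha) * cosine
```

With `initial_lr=0.05` (the final learning rate found above), the schedule starts at 0.05, reaches half that value mid-cycle, and resets to 0.05 at each restart boundary.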
class Snapshot(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # avoid the mutable-default-argument pitfall; Keras passes logs explicitly
        logs = logs or {}
        if logs.get("val_accuracy", 0) > 0.9:  # save only those epochs with ensemble potential
            self.model.save(f"/content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_{epoch}_{logs['val_accuracy']}")

callbacks = [
    Snapshot()
]
lr_scheduler = tf.keras.optimizers.schedules.CosineDecayRestarts(
    LR,
    steps_per_epoch * 10  # first_decay_steps: found to work well for CIFAR-10 with shorter training time
)
optimizer = SGD(
    learning_rate = lr_scheduler,
    momentum = momentum
)
train_cutmix_ds, val_ds = set_up_cutmix()
model = build_wideresnet(optimizer, name="SEWRN_Cutmix_Ensemble", layer=SEWideResNetLayer)
results, fig = evaluator.evaluate_model(model, training_data=train_cutmix_ds, validation_data=val_ds,callbacks=callbacks)
evaluator.save_history()
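Once training completes, the saved snapshots can be combined at inference time by averaging their predicted class probabilities and taking the argmax. A minimal sketch of that averaging step (shown with plain lists of softmax outputs; in the notebook the snapshots saved above would first be restored with `tf.keras.models.load_model(path)` and each would produce its own `(n_samples, n_classes)` prediction array):

```python
def ensemble_predict(prob_batches):
    # prob_batches: one (n_samples x n_classes) list of softmax rows
    # per snapshot model
    n_models = len(prob_batches)
    preds = []
    for rows in zip(*prob_batches):  # iterate over samples across models
        avg = [sum(col) / n_models for col in zip(*rows)]  # mean prob per class
        preds.append(max(range(len(avg)), key=avg.__getitem__))  # argmax
    return preds
```

Averaging probabilities (rather than majority-voting hard labels) is what the snapshot-ensembles paper uses, since it lets confident snapshots outweigh uncertain ones.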
Model: "SEWRN_Cutmix_Ensemble"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 32, 32, 3)] 0
normalization (Normalizatio (None, 32, 32, 3) 7
n)
conv2d (Conv2D) (None, 32, 32, 16) 432
wide_resnet_block (WideResn (None, 32, 32, 160) 1732544
etBlock)
wide_resnet_block_1 (WideRe (None, 16, 16, 320) 7024000
snetBlock)
wide_resnet_block_2 (WideRe (None, 8, 8, 640) 28076800
snetBlock)
batch_normalization_24 (Bat (None, 8, 8, 640) 2560
chNormalization)
activation_12 (Activation) (None, 8, 8, 640) 0
global_average_pooling2d_12 (None, 640) 0
(GlobalAveragePooling2D)
dense_24 (Dense) (None, 10) 6410
=================================================================
Total params: 36,842,753
Trainable params: 36,824,794
Non-trainable params: 17,959
_________________________________________________________________
None
Training SEWRN_Cutmix_Ensemble
Epoch 1/200
6/313 [..............................] - ETA: 1:05 - loss: 17.6617 - accuracy: 0.1263WARNING:tensorflow:Callback method `on_train_batch_end` is slow compared to the batch time (batch time: 0.0956s vs `on_train_batch_end` time: 0.1152s). Check your callbacks.
313/313 [==============================] - 132s 229ms/step - loss: 15.5086 - accuracy: 0.1958 - val_loss: 13.4175 - val_accuracy: 0.2304
Epoch 2/200
313/313 [==============================] - 71s 225ms/step - loss: 11.9148 - accuracy: 0.2417 - val_loss: 10.4517 - val_accuracy: 0.2452
Epoch 3/200
313/313 [==============================] - 71s 226ms/step - loss: 9.3953 - accuracy: 0.2815 - val_loss: 8.3153 - val_accuracy: 0.2936
Epoch 4/200
313/313 [==============================] - 71s 226ms/step - loss: 7.6705 - accuracy: 0.3328 - val_loss: 7.1484 - val_accuracy: 0.2179
Epoch 5/200
313/313 [==============================] - 71s 226ms/step - loss: 6.4745 - accuracy: 0.4040 - val_loss: 6.1222 - val_accuracy: 0.3576
Epoch 6/200
313/313 [==============================] - 71s 227ms/step - loss: 5.6602 - accuracy: 0.4827 - val_loss: 5.0981 - val_accuracy: 0.5130
Epoch 7/200
313/313 [==============================] - 71s 226ms/step - loss: 5.1349 - accuracy: 0.5411 - val_loss: 4.5283 - val_accuracy: 0.6156
Epoch 8/200
313/313 [==============================] - 71s 226ms/step - loss: 4.8422 - accuracy: 0.5734 - val_loss: 4.2587 - val_accuracy: 0.6607
Epoch 9/200
313/313 [==============================] - 71s 226ms/step - loss: 4.6691 - accuracy: 0.6086 - val_loss: 4.0210 - val_accuracy: 0.7267
Epoch 10/200
313/313 [==============================] - 70s 225ms/step - loss: 4.6016 - accuracy: 0.6268 - val_loss: 3.9302 - val_accuracy: 0.7634
Epoch 11/200
313/313 [==============================] - 71s 225ms/step - loss: 4.4125 - accuracy: 0.5164 - val_loss: 3.9840 - val_accuracy: 0.4799
Epoch 12/200
313/313 [==============================] - 71s 226ms/step - loss: 3.6092 - accuracy: 0.5634 - val_loss: 3.0722 - val_accuracy: 0.5875
Epoch 13/200
313/313 [==============================] - 71s 225ms/step - loss: 3.0368 - accuracy: 0.5944 - val_loss: 2.5405 - val_accuracy: 0.6183
Epoch 14/200
313/313 [==============================] - 71s 225ms/step - loss: 2.6323 - accuracy: 0.6130 - val_loss: 2.4330 - val_accuracy: 0.5403
Epoch 15/200
313/313 [==============================] - 71s 225ms/step - loss: 2.3301 - accuracy: 0.6419 - val_loss: 1.6687 - val_accuracy: 0.7468
Epoch 16/200
313/313 [==============================] - 71s 225ms/step - loss: 2.1089 - accuracy: 0.6620 - val_loss: 1.6508 - val_accuracy: 0.6883
Epoch 17/200
313/313 [==============================] - 71s 226ms/step - loss: 1.9481 - accuracy: 0.6730 - val_loss: 1.2834 - val_accuracy: 0.7839
Epoch 18/200
313/313 [==============================] - 71s 226ms/step - loss: 1.8210 - accuracy: 0.6943 - val_loss: 1.1403 - val_accuracy: 0.8188
Epoch 19/200
313/313 [==============================] - 71s 225ms/step - loss: 1.7210 - accuracy: 0.7061 - val_loss: 1.1275 - val_accuracy: 0.8007
Epoch 20/200
313/313 [==============================] - 71s 226ms/step - loss: 1.6450 - accuracy: 0.7210 - val_loss: 1.0217 - val_accuracy: 0.8291
Epoch 21/200
313/313 [==============================] - 71s 225ms/step - loss: 1.5685 - accuracy: 0.7365 - val_loss: 0.9221 - val_accuracy: 0.8474
Epoch 22/200
313/313 [==============================] - 71s 226ms/step - loss: 1.5026 - accuracy: 0.7546 - val_loss: 0.8410 - val_accuracy: 0.8666
Epoch 23/200
313/313 [==============================] - 71s 227ms/step - loss: 1.4447 - accuracy: 0.7703 - val_loss: 0.8348 - val_accuracy: 0.8621
Epoch 24/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3876 - accuracy: 0.7860 - val_loss: 0.7435 - val_accuracy: 0.8876
Epoch 25/200
313/313 [==============================] - ETA: 0s - loss: 1.3285 - accuracy: 0.8054INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_24_0.9043999910354614/assets
313/313 [==============================] - 97s 309ms/step - loss: 1.3285 - accuracy: 0.8054 - val_loss: 0.6984 - val_accuracy: 0.9044
Epoch 26/200
313/313 [==============================] - 71s 226ms/step - loss: 1.2816 - accuracy: 0.8180 - val_loss: 0.7053 - val_accuracy: 0.8961
Epoch 27/200
313/313 [==============================] - ETA: 0s - loss: 1.2397 - accuracy: 0.8316INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_26_0.9176999926567078/assets
313/313 [==============================] - 97s 311ms/step - loss: 1.2397 - accuracy: 0.8316 - val_loss: 0.6406 - val_accuracy: 0.9177
Epoch 28/200
313/313 [==============================] - ETA: 0s - loss: 1.2029 - accuracy: 0.8479INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_27_0.9229999780654907/assets
313/313 [==============================] - 97s 310ms/step - loss: 1.2029 - accuracy: 0.8479 - val_loss: 0.6228 - val_accuracy: 0.9230
Epoch 29/200
313/313 [==============================] - ETA: 0s - loss: 1.1812 - accuracy: 0.8544INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_28_0.9246000051498413/assets
313/313 [==============================] - 97s 311ms/step - loss: 1.1812 - accuracy: 0.8544 - val_loss: 0.6171 - val_accuracy: 0.9246
Epoch 30/200
313/313 [==============================] - ETA: 0s - loss: 1.1740 - accuracy: 0.8573INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_29_0.9243000149726868/assets
313/313 [==============================] - 97s 310ms/step - loss: 1.1740 - accuracy: 0.8573 - val_loss: 0.6145 - val_accuracy: 0.9243
Epoch 31/200
313/313 [==============================] - 71s 226ms/step - loss: 1.6368 - accuracy: 0.6806 - val_loss: 1.1679 - val_accuracy: 0.7367
Epoch 32/200
313/313 [==============================] - 71s 226ms/step - loss: 1.5933 - accuracy: 0.7031 - val_loss: 1.0202 - val_accuracy: 0.7997
Epoch 33/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5696 - accuracy: 0.7099 - val_loss: 1.0442 - val_accuracy: 0.7814
Epoch 34/200
313/313 [==============================] - 71s 226ms/step - loss: 1.5377 - accuracy: 0.7219 - val_loss: 1.0888 - val_accuracy: 0.7675
Epoch 35/200
313/313 [==============================] - 71s 226ms/step - loss: 1.5230 - accuracy: 0.7283 - val_loss: 0.9346 - val_accuracy: 0.8311
Epoch 36/200
313/313 [==============================] - 71s 226ms/step - loss: 1.5109 - accuracy: 0.7329 - val_loss: 1.0456 - val_accuracy: 0.7833
Epoch 37/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4908 - accuracy: 0.7451 - val_loss: 0.9819 - val_accuracy: 0.8145
Epoch 38/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4817 - accuracy: 0.7493 - val_loss: 0.8872 - val_accuracy: 0.8474
Epoch 39/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4754 - accuracy: 0.7510 - val_loss: 1.0060 - val_accuracy: 0.8017
Epoch 40/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4693 - accuracy: 0.7538 - val_loss: 0.9398 - val_accuracy: 0.8317
Epoch 41/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4470 - accuracy: 0.7657 - val_loss: 1.1675 - val_accuracy: 0.7526
Epoch 42/200
313/313 [==============================] - 71s 227ms/step - loss: 1.4454 - accuracy: 0.7634 - val_loss: 0.8636 - val_accuracy: 0.8557
Epoch 43/200
313/313 [==============================] - 71s 227ms/step - loss: 1.4312 - accuracy: 0.7719 - val_loss: 1.0297 - val_accuracy: 0.7922
Epoch 44/200
313/313 [==============================] - 71s 227ms/step - loss: 1.4186 - accuracy: 0.7795 - val_loss: 0.8596 - val_accuracy: 0.8643
Epoch 45/200
313/313 [==============================] - 71s 227ms/step - loss: 1.4018 - accuracy: 0.7839 - val_loss: 0.9041 - val_accuracy: 0.8446
Epoch 46/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3946 - accuracy: 0.7890 - val_loss: 0.8905 - val_accuracy: 0.8462
Epoch 47/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3754 - accuracy: 0.7962 - val_loss: 0.9186 - val_accuracy: 0.8391
Epoch 48/200
313/313 [==============================] - 71s 227ms/step - loss: 1.3649 - accuracy: 0.8000 - val_loss: 0.8122 - val_accuracy: 0.8799
Epoch 49/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3455 - accuracy: 0.8086 - val_loss: 0.7993 - val_accuracy: 0.8763
Epoch 50/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3288 - accuracy: 0.8136 - val_loss: 0.8420 - val_accuracy: 0.8673
Epoch 51/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3037 - accuracy: 0.8206 - val_loss: 0.7862 - val_accuracy: 0.8836
Epoch 52/200
313/313 [==============================] - 70s 225ms/step - loss: 1.2853 - accuracy: 0.8270 - val_loss: 0.8425 - val_accuracy: 0.8686
Epoch 53/200
313/313 [==============================] - ETA: 0s - loss: 1.2612 - accuracy: 0.8360INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_52_0.9002000093460083/assets
313/313 [==============================] - 96s 308ms/step - loss: 1.2612 - accuracy: 0.8360 - val_loss: 0.7316 - val_accuracy: 0.9002
Epoch 54/200
313/313 [==============================] - ETA: 0s - loss: 1.2361 - accuracy: 0.8368INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_53_0.9024999737739563/assets
313/313 [==============================] - 97s 310ms/step - loss: 1.2361 - accuracy: 0.8368 - val_loss: 0.7200 - val_accuracy: 0.9025
Epoch 55/200
313/313 [==============================] - ETA: 0s - loss: 1.2176 - accuracy: 0.8469INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_54_0.911300003528595/assets
313/313 [==============================] - 98s 312ms/step - loss: 1.2176 - accuracy: 0.8469 - val_loss: 0.6814 - val_accuracy: 0.9113
Epoch 56/200
313/313 [==============================] - ETA: 0s - loss: 1.1931 - accuracy: 0.8529INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_55_0.9157999753952026/assets
313/313 [==============================] - 97s 309ms/step - loss: 1.1931 - accuracy: 0.8529 - val_loss: 0.6546 - val_accuracy: 0.9158
Epoch 57/200
313/313 [==============================] - ETA: 0s - loss: 1.1652 - accuracy: 0.8623INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_56_0.9157000184059143/assets
313/313 [==============================] - 98s 313ms/step - loss: 1.1652 - accuracy: 0.8623 - val_loss: 0.6637 - val_accuracy: 0.9157
Epoch 58/200
313/313 [==============================] - ETA: 0s - loss: 1.1368 - accuracy: 0.8671INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_57_0.9140999913215637/assets
313/313 [==============================] - 97s 310ms/step - loss: 1.1368 - accuracy: 0.8671 - val_loss: 0.6597 - val_accuracy: 0.9141
Epoch 59/200
313/313 [==============================] - ETA: 0s - loss: 1.1064 - accuracy: 0.8752INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_58_0.9182999730110168/assets
313/313 [==============================] - 97s 311ms/step - loss: 1.1064 - accuracy: 0.8752 - val_loss: 0.6294 - val_accuracy: 0.9183
Epoch 60/200
313/313 [==============================] - ETA: 0s - loss: 1.0853 - accuracy: 0.8789INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_59_0.9275000095367432/assets
313/313 [==============================] - 99s 316ms/step - loss: 1.0853 - accuracy: 0.8789 - val_loss: 0.6033 - val_accuracy: 0.9275
Epoch 61/200
313/313 [==============================] - ETA: 0s - loss: 1.0603 - accuracy: 0.8853INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_60_0.9294999837875366/assets
313/313 [==============================] - 98s 314ms/step - loss: 1.0603 - accuracy: 0.8853 - val_loss: 0.5923 - val_accuracy: 0.9295
Epoch 62/200
313/313 [==============================] - ETA: 0s - loss: 1.0352 - accuracy: 0.8923INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_61_0.9301999807357788/assets
313/313 [==============================] - 98s 312ms/step - loss: 1.0352 - accuracy: 0.8923 - val_loss: 0.5777 - val_accuracy: 0.9302
Epoch 63/200
313/313 [==============================] - ETA: 0s - loss: 1.0171 - accuracy: 0.8989INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_62_0.9350000023841858/assets
313/313 [==============================] - 98s 314ms/step - loss: 1.0171 - accuracy: 0.8989 - val_loss: 0.5639 - val_accuracy: 0.9350
Epoch 64/200
313/313 [==============================] - ETA: 0s - loss: 1.0012 - accuracy: 0.8988INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_63_0.935699999332428/assets
313/313 [==============================] - 98s 315ms/step - loss: 1.0012 - accuracy: 0.8988 - val_loss: 0.5599 - val_accuracy: 0.9357
Epoch 65/200
313/313 [==============================] - ETA: 0s - loss: 0.9847 - accuracy: 0.9033INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_64_0.9365000128746033/assets
313/313 [==============================] - 98s 314ms/step - loss: 0.9847 - accuracy: 0.9033 - val_loss: 0.5476 - val_accuracy: 0.9365
Epoch 66/200
313/313 [==============================] - ETA: 0s - loss: 0.9751 - accuracy: 0.9064INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_65_0.9404000043869019/assets
313/313 [==============================] - 97s 312ms/step - loss: 0.9751 - accuracy: 0.9064 - val_loss: 0.5355 - val_accuracy: 0.9404
Epoch 67/200
313/313 [==============================] - ETA: 0s - loss: 0.9655 - accuracy: 0.9109INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_66_0.9405999779701233/assets
313/313 [==============================] - 97s 311ms/step - loss: 0.9655 - accuracy: 0.9109 - val_loss: 0.5360 - val_accuracy: 0.9406
Epoch 68/200
313/313 [==============================] - ETA: 0s - loss: 0.9588 - accuracy: 0.9107INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_67_0.9419000148773193/assets
313/313 [==============================] - 97s 311ms/step - loss: 0.9588 - accuracy: 0.9107 - val_loss: 0.5299 - val_accuracy: 0.9419
Epoch 69/200
313/313 [==============================] - ETA: 0s - loss: 0.9553 - accuracy: 0.9141INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_68_0.9415000081062317/assets
313/313 [==============================] - 98s 313ms/step - loss: 0.9553 - accuracy: 0.9141 - val_loss: 0.5301 - val_accuracy: 0.9415
Epoch 70/200
313/313 [==============================] - ETA: 0s - loss: 0.9601 - accuracy: 0.9111INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_69_0.9417999982833862/assets
313/313 [==============================] - 97s 310ms/step - loss: 0.9601 - accuracy: 0.9111 - val_loss: 0.5304 - val_accuracy: 0.9418
Epoch 71/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5781 - accuracy: 0.7153 - val_loss: 1.6228 - val_accuracy: 0.6120
Epoch 72/200
313/313 [==============================] - 71s 226ms/step - loss: 1.5612 - accuracy: 0.7518 - val_loss: 1.1625 - val_accuracy: 0.7944
Epoch 73/200
313/313 [==============================] - 71s 226ms/step - loss: 1.5560 - accuracy: 0.7592 - val_loss: 1.0231 - val_accuracy: 0.8367
Epoch 74/200
313/313 [==============================] - 71s 225ms/step - loss: 1.5502 - accuracy: 0.7682 - val_loss: 0.9897 - val_accuracy: 0.8541
Epoch 75/200
313/313 [==============================] - 71s 225ms/step - loss: 1.5368 - accuracy: 0.7736 - val_loss: 1.0604 - val_accuracy: 0.8259
Epoch 76/200
313/313 [==============================] - 71s 226ms/step - loss: 1.5367 - accuracy: 0.7729 - val_loss: 1.0381 - val_accuracy: 0.8422
Epoch 77/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5235 - accuracy: 0.7786 - val_loss: 0.9913 - val_accuracy: 0.8532
Epoch 78/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5313 - accuracy: 0.7774 - val_loss: 1.0033 - val_accuracy: 0.8543
Epoch 79/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5268 - accuracy: 0.7793 - val_loss: 1.0990 - val_accuracy: 0.8211
Epoch 80/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5208 - accuracy: 0.7807 - val_loss: 1.0095 - val_accuracy: 0.8486
Epoch 81/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5205 - accuracy: 0.7847 - val_loss: 1.0359 - val_accuracy: 0.8402
Epoch 82/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5230 - accuracy: 0.7818 - val_loss: 0.9664 - val_accuracy: 0.8678
Epoch 83/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5110 - accuracy: 0.7857 - val_loss: 0.9923 - val_accuracy: 0.8515
Epoch 84/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5135 - accuracy: 0.7856 - val_loss: 1.1525 - val_accuracy: 0.8043
Epoch 85/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5120 - accuracy: 0.7897 - val_loss: 1.1484 - val_accuracy: 0.8053
Epoch 86/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5002 - accuracy: 0.7931 - val_loss: 0.9652 - val_accuracy: 0.8613
Epoch 87/200
313/313 [==============================] - 71s 226ms/step - loss: 1.5007 - accuracy: 0.7936 - val_loss: 1.1770 - val_accuracy: 0.7940
Epoch 88/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4939 - accuracy: 0.7951 - val_loss: 0.9712 - val_accuracy: 0.8622
Epoch 89/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4937 - accuracy: 0.7930 - val_loss: 0.9509 - val_accuracy: 0.8762
Epoch 90/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4862 - accuracy: 0.7997 - val_loss: 1.0535 - val_accuracy: 0.8401
Epoch 91/200
313/313 [==============================] - 71s 227ms/step - loss: 1.4832 - accuracy: 0.8018 - val_loss: 0.9751 - val_accuracy: 0.8612
Epoch 92/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4762 - accuracy: 0.8002 - val_loss: 0.9711 - val_accuracy: 0.8660
Epoch 93/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4736 - accuracy: 0.8018 - val_loss: 1.0282 - val_accuracy: 0.8515
Epoch 94/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4670 - accuracy: 0.8074 - val_loss: 1.0016 - val_accuracy: 0.8566
Epoch 95/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4652 - accuracy: 0.8085 - val_loss: 0.9505 - val_accuracy: 0.8743
Epoch 96/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4600 - accuracy: 0.8080 - val_loss: 0.9528 - val_accuracy: 0.8682
Epoch 97/200
313/313 [==============================] - 71s 227ms/step - loss: 1.4541 - accuracy: 0.8102 - val_loss: 1.0619 - val_accuracy: 0.8392
Epoch 98/200
313/313 [==============================] - 71s 227ms/step - loss: 1.4425 - accuracy: 0.8140 - val_loss: 1.1284 - val_accuracy: 0.8184
Epoch 99/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4381 - accuracy: 0.8158 - val_loss: 0.9954 - val_accuracy: 0.8568
Epoch 100/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4287 - accuracy: 0.8191 - val_loss: 0.9008 - val_accuracy: 0.8837
Epoch 101/200
313/313 [==============================] - 71s 227ms/step - loss: 1.4277 - accuracy: 0.8201 - val_loss: 0.9159 - val_accuracy: 0.8774
Epoch 102/200
313/313 [==============================] - 71s 226ms/step - loss: 1.4144 - accuracy: 0.8252 - val_loss: 0.8854 - val_accuracy: 0.8892
Epoch 103/200
313/313 [==============================] - 71s 227ms/step - loss: 1.4029 - accuracy: 0.8252 - val_loss: 0.9474 - val_accuracy: 0.8729
Epoch 104/200
313/313 [==============================] - 71s 227ms/step - loss: 1.4005 - accuracy: 0.8253 - val_loss: 0.9212 - val_accuracy: 0.8791
Epoch 105/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3857 - accuracy: 0.8321 - val_loss: 0.9114 - val_accuracy: 0.8814
Epoch 106/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3808 - accuracy: 0.8313 - val_loss: 0.9064 - val_accuracy: 0.8873
Epoch 107/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3679 - accuracy: 0.8344 - val_loss: 0.9124 - val_accuracy: 0.8805
Epoch 108/200
313/313 [==============================] - 71s 227ms/step - loss: 1.3534 - accuracy: 0.8390 - val_loss: 0.9156 - val_accuracy: 0.8739
Epoch 109/200
313/313 [==============================] - 71s 227ms/step - loss: 1.3417 - accuracy: 0.8414 - val_loss: 0.8714 - val_accuracy: 0.8870
Epoch 110/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3297 - accuracy: 0.8450 - val_loss: 0.9895 - val_accuracy: 0.8652
Epoch 111/200
313/313 [==============================] - ETA: 0s - loss: 1.3278 - accuracy: 0.8454INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_110_0.9075000286102295/assets
313/313 [==============================] - 98s 315ms/step - loss: 1.3278 - accuracy: 0.8454 - val_loss: 0.8085 - val_accuracy: 0.9075
Epoch 112/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3083 - accuracy: 0.8492 - val_loss: 0.8284 - val_accuracy: 0.8975
Epoch 113/200
313/313 [==============================] - 71s 226ms/step - loss: 1.3000 - accuracy: 0.8542 - val_loss: 0.8433 - val_accuracy: 0.8949
Epoch 114/200
313/313 [==============================] - 71s 226ms/step - loss: 1.2826 - accuracy: 0.8559 - val_loss: 0.8935 - val_accuracy: 0.8797
Epoch 115/200
313/313 [==============================] - ETA: 0s - loss: 1.2633 - accuracy: 0.8586INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_114_0.9126999974250793/assets
313/313 [==============================] - 98s 313ms/step - loss: 1.2633 - accuracy: 0.8586 - val_loss: 0.7661 - val_accuracy: 0.9127
Epoch 116/200
313/313 [==============================] - ETA: 0s - loss: 1.2533 - accuracy: 0.8612INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_115_0.9157000184059143/assets
313/313 [==============================] - 98s 312ms/step - loss: 1.2533 - accuracy: 0.8612 - val_loss: 0.7491 - val_accuracy: 0.9157
Epoch 117/200
313/313 [==============================] - ETA: 0s - loss: 1.2394 - accuracy: 0.8683INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_116_0.9071000218391418/assets
313/313 [==============================] - 98s 312ms/step - loss: 1.2394 - accuracy: 0.8683 - val_loss: 0.7988 - val_accuracy: 0.9071
Epoch 118/200
313/313 [==============================] - ETA: 0s - loss: 1.2213 - accuracy: 0.8692INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_117_0.906499981880188/assets
313/313 [==============================] - 98s 314ms/step - loss: 1.2213 - accuracy: 0.8692 - val_loss: 0.7806 - val_accuracy: 0.9065
Epoch 119/200
313/313 [==============================] - 71s 227ms/step - loss: 1.2070 - accuracy: 0.8717 - val_loss: 0.8145 - val_accuracy: 0.8972
Epoch 120/200
313/313 [==============================] - ETA: 0s - loss: 1.1875 - accuracy: 0.8756INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_119_0.9179999828338623/assets
313/313 [==============================] - 97s 311ms/step - loss: 1.1875 - accuracy: 0.8756 - val_loss: 0.7212 - val_accuracy: 0.9180
Epoch 121/200
313/313 [==============================] - ETA: 0s - loss: 1.1792 - accuracy: 0.8794INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_120_0.9228000044822693/assets
313/313 [==============================] - 98s 314ms/step - loss: 1.1792 - accuracy: 0.8794 - val_loss: 0.7093 - val_accuracy: 0.9228
Epoch 122/200
313/313 [==============================] - ETA: 0s - loss: 1.1559 - accuracy: 0.8840INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_121_0.9199000000953674/assets
313/313 [==============================] - 98s 312ms/step - loss: 1.1559 - accuracy: 0.8840 - val_loss: 0.6997 - val_accuracy: 0.9199
Epoch 123/200
313/313 [==============================] - ETA: 0s - loss: 1.1458 - accuracy: 0.8836INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_122_0.9251000285148621/assets
313/313 [==============================] - 98s 313ms/step - loss: 1.1458 - accuracy: 0.8836 - val_loss: 0.6774 - val_accuracy: 0.9251
Epoch 124/200
313/313 [==============================] - ETA: 0s - loss: 1.1286 - accuracy: 0.8882INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_123_0.9228000044822693/assets
313/313 [==============================] - 97s 311ms/step - loss: 1.1286 - accuracy: 0.8882 - val_loss: 0.6929 - val_accuracy: 0.9228
Epoch 125/200
313/313 [==============================] - ETA: 0s - loss: 1.1120 - accuracy: 0.8922INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_124_0.9168999791145325/assets
313/313 [==============================] - 98s 313ms/step - loss: 1.1120 - accuracy: 0.8922 - val_loss: 0.6840 - val_accuracy: 0.9169
Epoch 126/200
313/313 [==============================] - ETA: 0s - loss: 1.0933 - accuracy: 0.8975INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_125_0.9311000108718872/assets
313/313 [==============================] - 98s 314ms/step - loss: 1.0933 - accuracy: 0.8975 - val_loss: 0.6414 - val_accuracy: 0.9311
Epoch 127/200
313/313 [==============================] - ETA: 0s - loss: 1.0792 - accuracy: 0.8999INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_126_0.9251999855041504/assets
313/313 [==============================] - 98s 312ms/step - loss: 1.0792 - accuracy: 0.8999 - val_loss: 0.6491 - val_accuracy: 0.9252
Epoch 128/200
313/313 [==============================] - ETA: 0s - loss: 1.0653 - accuracy: 0.8998INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_127_0.9205999970436096/assets
313/313 [==============================] - 98s 313ms/step - loss: 1.0653 - accuracy: 0.8998 - val_loss: 0.6791 - val_accuracy: 0.9206
Epoch 129/200
313/313 [==============================] - ETA: 0s - loss: 1.0465 - accuracy: 0.9038INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_128_0.926800012588501/assets
313/313 [==============================] - 97s 311ms/step - loss: 1.0465 - accuracy: 0.9038 - val_loss: 0.6243 - val_accuracy: 0.9268
Epoch 130/200
313/313 [==============================] - ETA: 0s - loss: 1.0328 - accuracy: 0.9061INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_129_0.9325000047683716/assets
313/313 [==============================] - 98s 312ms/step - loss: 1.0328 - accuracy: 0.9061 - val_loss: 0.5995 - val_accuracy: 0.9325
Epoch 131/200
313/313 [==============================] - ETA: 0s - loss: 1.0150 - accuracy: 0.9127INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_130_0.9314000010490417/assets
313/313 [==============================] - 98s 314ms/step - loss: 1.0150 - accuracy: 0.9127 - val_loss: 0.5954 - val_accuracy: 0.9314
Epoch 132/200
313/313 [==============================] - ETA: 0s - loss: 1.0041 - accuracy: 0.9136INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_131_0.9333999752998352/assets
313/313 [==============================] - 97s 311ms/step - loss: 1.0041 - accuracy: 0.9136 - val_loss: 0.5837 - val_accuracy: 0.9334
Epoch 133/200
313/313 [==============================] - ETA: 0s - loss: 0.9852 - accuracy: 0.9168INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_132_0.9358000159263611/assets
313/313 [==============================] - 98s 313ms/step - loss: 0.9852 - accuracy: 0.9168 - val_loss: 0.5633 - val_accuracy: 0.9358
Epoch 134/200
313/313 [==============================] - ETA: 0s - loss: 0.9751 - accuracy: 0.9218INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_133_0.9391999840736389/assets
313/313 [==============================] - 98s 314ms/step - loss: 0.9751 - accuracy: 0.9218 - val_loss: 0.5496 - val_accuracy: 0.9392
Epoch 135/200
313/313 [==============================] - ETA: 0s - loss: 0.9620 - accuracy: 0.9222INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_134_0.9416999816894531/assets
313/313 [==============================] - 98s 314ms/step - loss: 0.9620 - accuracy: 0.9222 - val_loss: 0.5408 - val_accuracy: 0.9417
Epoch 136/200
313/313 [==============================] - ETA: 0s - loss: 0.9519 - accuracy: 0.9222INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_135_0.9390000104904175/assets
313/313 [==============================] - 98s 314ms/step - loss: 0.9519 - accuracy: 0.9222 - val_loss: 0.5450 - val_accuracy: 0.9390
Epoch 137/200
313/313 [==============================] - ETA: 0s - loss: 0.9401 - accuracy: 0.9265INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_136_0.9412999749183655/assets
313/313 [==============================] - 97s 311ms/step - loss: 0.9401 - accuracy: 0.9265 - val_loss: 0.5297 - val_accuracy: 0.9413
Epoch 138/200
313/313 [==============================] - ETA: 0s - loss: 0.9270 - accuracy: 0.9283INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_137_0.9413999915122986/assets
313/313 [==============================] - 98s 314ms/step - loss: 0.9270 - accuracy: 0.9283 - val_loss: 0.5209 - val_accuracy: 0.9414
Epoch 139/200
313/313 [==============================] - ETA: 0s - loss: 0.9222 - accuracy: 0.9300INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_138_0.9416999816894531/assets
313/313 [==============================] - 98s 313ms/step - loss: 0.9222 - accuracy: 0.9300 - val_loss: 0.5221 - val_accuracy: 0.9417
Epoch 140/200
313/313 [==============================] - ETA: 0s - loss: 0.9115 - accuracy: 0.9297INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_139_0.944100022315979/assets
313/313 [==============================] - 98s 313ms/step - loss: 0.9115 - accuracy: 0.9297 - val_loss: 0.5109 - val_accuracy: 0.9441
Epoch 141/200
313/313 [==============================] - ETA: 0s - loss: 0.9063 - accuracy: 0.9319INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_140_0.9448999762535095/assets
313/313 [==============================] - 97s 311ms/step - loss: 0.9063 - accuracy: 0.9319 - val_loss: 0.5023 - val_accuracy: 0.9449
Epoch 142/200
313/313 [==============================] - ETA: 0s - loss: 0.9031 - accuracy: 0.9334INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_141_0.9441999793052673/assets
313/313 [==============================] - 98s 314ms/step - loss: 0.9031 - accuracy: 0.9334 - val_loss: 0.4998 - val_accuracy: 0.9442
Epoch 143/200
313/313 [==============================] - ETA: 0s - loss: 0.8962 - accuracy: 0.9319INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_142_0.944100022315979/assets
313/313 [==============================] - 98s 312ms/step - loss: 0.8962 - accuracy: 0.9319 - val_loss: 0.4981 - val_accuracy: 0.9441
Epoch 144/200
313/313 [==============================] - ETA: 0s - loss: 0.8942 - accuracy: 0.9357INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_143_0.9445000290870667/assets
313/313 [==============================] - 98s 314ms/step - loss: 0.8942 - accuracy: 0.9357 - val_loss: 0.4956 - val_accuracy: 0.9445
Epoch 145/200
313/313 [==============================] - ETA: 0s - loss: 0.8864 - accuracy: 0.9356INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_144_0.944100022315979/assets
313/313 [==============================] - 98s 315ms/step - loss: 0.8864 - accuracy: 0.9356 - val_loss: 0.4931 - val_accuracy: 0.9441
Epoch 146/200
313/313 [==============================] - ETA: 0s - loss: 0.8859 - accuracy: 0.9345INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_145_0.9445000290870667/assets
313/313 [==============================] - 97s 310ms/step - loss: 0.8859 - accuracy: 0.9345 - val_loss: 0.4897 - val_accuracy: 0.9445
Epoch 147/200
313/313 [==============================] - ETA: 0s - loss: 0.8852 - accuracy: 0.9369INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_146_0.9458000063896179/assets
313/313 [==============================] - 98s 314ms/step - loss: 0.8852 - accuracy: 0.9369 - val_loss: 0.4889 - val_accuracy: 0.9458
Epoch 148/200
313/313 [==============================] - ETA: 0s - loss: 0.8840 - accuracy: 0.9362INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_147_0.9448999762535095/assets
313/313 [==============================] - 97s 311ms/step - loss: 0.8840 - accuracy: 0.9362 - val_loss: 0.4894 - val_accuracy: 0.9449
Epoch 149/200
313/313 [==============================] - ETA: 0s - loss: 0.8840 - accuracy: 0.9354INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_148_0.9447000026702881/assets
313/313 [==============================] - 99s 316ms/step - loss: 0.8840 - accuracy: 0.9354 - val_loss: 0.4895 - val_accuracy: 0.9447
Epoch 150/200
313/313 [==============================] - ETA: 0s - loss: 0.8827 - accuracy: 0.9367INFO:tensorflow:Assets written to: /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/ensemble_149_0.9447000026702881/assets
313/313 [==============================] - 98s 312ms/step - loss: 0.8827 - accuracy: 0.9367 - val_loss: 0.4892 - val_accuracy: 0.9447
Epoch 151/200
313/313 [==============================] - 72s 229ms/step - loss: 1.6464 - accuracy: 0.7064 - val_loss: 1.7352 - val_accuracy: 0.6159
Epoch 152/200
313/313 [==============================] - 71s 228ms/step - loss: 1.6015 - accuracy: 0.7585 - val_loss: 1.2176 - val_accuracy: 0.7897
Epoch 153/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5834 - accuracy: 0.7742 - val_loss: 1.1513 - val_accuracy: 0.8120
Epoch 154/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5802 - accuracy: 0.7838 - val_loss: 1.0583 - val_accuracy: 0.8549
Epoch 155/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5736 - accuracy: 0.7843 - val_loss: 1.1609 - val_accuracy: 0.8191
Epoch 156/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5708 - accuracy: 0.7917 - val_loss: 1.1424 - val_accuracy: 0.8296
Epoch 157/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5626 - accuracy: 0.7922 - val_loss: 1.0871 - val_accuracy: 0.8519
Epoch 158/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5660 - accuracy: 0.7926 - val_loss: 0.9849 - val_accuracy: 0.8822
Epoch 159/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5591 - accuracy: 0.7944 - val_loss: 1.0818 - val_accuracy: 0.8516
Epoch 160/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5616 - accuracy: 0.7908 - val_loss: 1.0154 - val_accuracy: 0.8657
Epoch 161/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5544 - accuracy: 0.7991 - val_loss: 1.1372 - val_accuracy: 0.8336
Epoch 162/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5586 - accuracy: 0.7956 - val_loss: 1.0784 - val_accuracy: 0.8494
Epoch 163/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5540 - accuracy: 0.7992 - val_loss: 1.0436 - val_accuracy: 0.8658
Epoch 164/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5609 - accuracy: 0.7951 - val_loss: 1.0680 - val_accuracy: 0.8549
Epoch 165/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5548 - accuracy: 0.8016 - val_loss: 1.0933 - val_accuracy: 0.8543
Epoch 166/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5524 - accuracy: 0.7990 - val_loss: 1.1609 - val_accuracy: 0.8264
Epoch 167/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5551 - accuracy: 0.7993 - val_loss: 1.1800 - val_accuracy: 0.8235
Epoch 168/200
313/313 [==============================] - 72s 228ms/step - loss: 1.5491 - accuracy: 0.8015 - val_loss: 0.9919 - val_accuracy: 0.8771
Epoch 169/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5563 - accuracy: 0.8002 - val_loss: 1.0459 - val_accuracy: 0.8696
Epoch 170/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5463 - accuracy: 0.8004 - val_loss: 1.1034 - val_accuracy: 0.8468
Epoch 171/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5465 - accuracy: 0.8032 - val_loss: 1.0031 - val_accuracy: 0.8747
Epoch 172/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5454 - accuracy: 0.8017 - val_loss: 1.0445 - val_accuracy: 0.8615
Epoch 173/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5447 - accuracy: 0.8023 - val_loss: 1.0124 - val_accuracy: 0.8711
Epoch 174/200
313/313 [==============================] - 71s 228ms/step - loss: 1.5421 - accuracy: 0.8026 - val_loss: 1.0204 - val_accuracy: 0.8697
Epoch 175/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5393 - accuracy: 0.8030 - val_loss: 1.0588 - val_accuracy: 0.8645
Epoch 176/200
313/313 [==============================] - 71s 227ms/step - loss: 1.5357 - accuracy: 0.8039 - val_loss: 1.0022 - val_accuracy: 0.8778
Epoch 177/200
243/313 [======================>.......] - ETA: 14s - loss: 1.5300 - accuracy: 0.8083
Halting Training
Saving best model to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_Cutmix_Ensemble
History saved to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/history.csv
display(results)
fig.show()
Epochs                                                              156
Batch Size                                                          128
Model Name                                        SEWRN_Cutmix_Ensemble
Remarks
Model Path            /content/drive/MyDrive/Data/DELE CA1/CIFAR10/S...
Train Loss                                                     0.885244
Test Loss                                                       0.48893
Train Acc                                                      0.936875
Test Acc                                                         0.9458
[Train - Test] Acc                                          -0.00892502
dtype: object
evaluator.save_history()
History saved to /content/drive/MyDrive/Data/DELE CA1/CIFAR10/history.csv
names = [
    "ensemble_146_0.9458000063896179",
    "ensemble_69_0.9417999982833862",
    "ensemble_29_0.9243000149726868"
]
models = [
    tf.keras.models.load_model(f"/content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/{name}/") for name in names
]
To build the ensemble, I pick three checkpoints, each saved just before one of the warm restarts, and load them. Their predicted probabilities are averaged to produce the final prediction.
def ensemble(models, X):
    # Collect each model's predicted class probabilities
    predicted_ensemble = [
        model.predict(X) for model in models
    ]
    # Average the probabilities across models, then take the most likely class
    return np.argmax(np.mean(predicted_ensemble, axis=0), axis=1)
ensemble_results = ensemble(models, X_val)
y_val_int = np.argmax(y_val, axis=1)
from sklearn.metrics import accuracy_score
accuracy_score(y_val_int, ensemble_results)
0.9502
We can see that ensembling the three models lifts validation accuracy to 95%. This does come at a cost at inference time, since every model in the ensemble must run on each input. However, given that we are not aiming for speed here, the improved validation accuracy helps us achieve our goal of a more robust model.
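The averaging in `ensemble()` above is soft voting: the per-class probabilities from each model are averaged before taking the argmax, rather than counting each model's hard vote. A minimal numeric sketch with made-up probabilities (the arrays stand in for the outputs of `model.predict(X)`):

```python
import numpy as np

# Toy per-model class probabilities for two samples over three classes.
p1 = np.array([[0.6, 0.3, 0.1],
               [0.2, 0.5, 0.3]])
p2 = np.array([[0.1, 0.8, 0.1],
               [0.3, 0.4, 0.3]])
p3 = np.array([[0.3, 0.4, 0.3],
               [0.1, 0.2, 0.7]])

# Average the probabilities, then take the most likely class,
# mirroring what ensemble() does with real model outputs.
mean_p = np.mean([p1, p2, p3], axis=0)
pred = np.argmax(mean_p, axis=1)
print(pred)  # -> [1 2]
```

Note that soft voting can pick a class no single model ranked first (the second sample here), because a confident minority can outweigh a hesitant majority.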
The full history of models trained, sorted by test accuracy, is as follows:
evaluator.return_history().sort_values("Test Acc")
| | Model Name | Train Acc | Test Acc | [Train - Test] Acc | Remarks |
|---|---|---|---|---|---|
| 12 | SEWRN_28_10_SWA_Cutmix | 0.196825 | 0.2231 | -0.026275 | NaN |
| 3 | efficientnetv2-s | 0.320450 | 0.3495 | -0.029050 | NaN |
| 0 | Baseline_MLP | 0.631750 | 0.5247 | 0.107050 | High Variance |
| 11 | SEWRN_28_10_Fixed_Cutmix_Mish | 0.639800 | 0.7739 | -0.134100 | NaN |
| 1 | Baseline_CNN_1 | 0.999825 | 0.8665 | 0.133325 | Low Avoidable Bias but Overfits |
| 4 | ImprovedWideResNet_28_10_No_Stochastic_Depth_B... | 0.997800 | 0.8739 | 0.123900 | NaN |
| 2 | Baseline_CNN_1_DataAug | 0.996625 | 0.8826 | 0.114025 | Data Aug lowers variance, but still overfits |
| 7 | ImprovedWideResNet_28_10_Dropout_No_Stochastic... | 0.999850 | 0.9170 | 0.082850 | NaN |
| 8 | WideResNet_28_10_Fixed_BasicDataAug | 0.998625 | 0.9333 | 0.065325 | NaN |
| 9 | WideResNet_28_10_Fixed_CutMix | 0.916025 | 0.9391 | -0.023075 | NaN |
| 5 | ImprovedWideResNet_28_10_No_Stochastic_Depth_B... | 0.999425 | 0.9398 | 0.059625 | NaN |
| 6 | ImprovedWideResNet_28_10_ProperDropout_No_Stoc... | 0.998900 | 0.9456 | 0.053300 | Best model thus far |
| 13 | SEWRN_Cutmix_Ensemble | 0.936875 | 0.9458 | -0.008925 | |
| 10 | SEWRN_28_10_Fixed_Cutmix | 0.929250 | 0.9461 | -0.016850 | NaN |
To verify that the model generalizes well, I evaluate it on the held-out test set.
# final_model = load_model("/content/drive/MyDrive/Data/DELE CA1/CIFAR10/SavedModels/SEWRN_28_10_Fixed_Cutmix")
y_test_int = np.argmax(y_test, axis=1)
y_pred = ensemble(models, X_test)
# final_model.evaluate(X_test, y_test)
accuracy_score(y_test_int, y_pred)
0.946
The test accuracy (94.6%) is very close to the validation accuracy (95.0%), so the ensemble generalizes well and shows no sign of overfitting to the validation set.
from sklearn.metrics import classification_report
report = classification_report(
    y_test_int, y_pred, target_names=list(class_labels.values())
)
print(report)
precision recall f1-score support
airplane 0.96 0.95 0.95 1000
automobile 0.97 0.98 0.97 1000
bird 0.94 0.93 0.93 1000
cat 0.88 0.89 0.88 1000
deer 0.94 0.96 0.95 1000
dog 0.91 0.90 0.90 1000
frog 0.96 0.97 0.96 1000
horse 0.98 0.96 0.97 1000
ship 0.97 0.97 0.97 1000
truck 0.97 0.96 0.97 1000
accuracy 0.95 10000
macro avg 0.95 0.95 0.95 10000
weighted avg 0.95 0.95 0.95 10000
From the classification report, we see that performance is strong and even across classes, with cat (F1 0.88) and dog (F1 0.90) the clear weak spots, while classes such as automobile, horse, ship, and truck all reach an F1 of 0.97.
To understand where the model fails, I inspect a random sample of misclassified test images.
wrong_example_mask = y_test_int != y_pred
X_test_wrong = X_test[wrong_example_mask]
y_test_wrong = y_test_int[wrong_example_mask]
y_pred_wrong = y_pred[wrong_example_mask]
random_idxs = np.random.choice(X_test_wrong.shape[0], 20, replace=False)
fig, ax = plt.subplots(4, 5, figsize=(20, 20))
plt.axis("off")
for idx, subplot in zip(random_idxs, ax.ravel()):
pred = class_labels[y_pred_wrong[idx]]
actual = class_labels[y_test_wrong[idx]]
subplot.imshow(X_test_wrong[idx], cmap='gray')
subplot.set_title(f"Label: {actual}, Predicted: {pred}")
In some of these images, the low 32x32 resolution makes it hard even for a human to tell whether the picture contains a cat or a dog, as both animals have similar shapes and poses. This helps explain the model's weaker precision and recall on those two classes.
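To quantify this confusion rather than eyeballing sample images, one option is a confusion matrix restricted to the two classes. A sketch with toy labels standing in for the notebook's `y_test_int` and `y_pred` (in CIFAR-10, class 3 is cat and class 5 is dog):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels standing in for y_test_int and y_pred.
y_true = np.array([3, 3, 5, 5, 3, 5])
y_hat = np.array([3, 5, 5, 3, 3, 5])

# cm[i, j] counts examples of true class i predicted as class j,
# so the off-diagonal entries are the cat<->dog confusions.
cm = confusion_matrix(y_true, y_hat, labels=[3, 5])
print(cm)
```

On the real predictions, comparing the off-diagonal counts against other class pairs would show how much of the cat/dog error budget is spent confusing the two with each other rather than with unrelated classes.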